Problem: AutoMLSearch in EvalML fails with an error while running the Autopipe algorithm. The possible causes and their solutions are listed below:
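1. Error type: ValueError: Input data contains NaN values.
Solution: Remove the rows that contain NaN values, or fill in the missing values.
Example code: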
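import pandas as pd
from dask.distributed import LocalCluster
from evalml.automl import AutoMLSearch
from evalml.automl.engine import DaskEngine
from evalml.problem_types import ProblemTypes
from evalml.utils import infer_feature_types
# Load the data with pandas. The URL is the dataset landing page given in the original
# answer; replace it with the location of a CSV file that has a 'target' column.
data = pd.read_csv('https://archive.ics.uci.edu/ml/datasets/heart+Disease')
# Drop the rows that contain NaN values
data = data.dropna(axis=0)
# Split features and target, then mark the string columns as categorical features
X = infer_feature_types(data.drop(columns='target'), {'thal': 'Categorical', 'ca': 'Categorical'})
y = data['target']
# Run the AutoML search on a local Dask cluster.
# Assumes a recent EvalML release in which DaskEngine takes a `cluster` argument
# and AutoMLSearch takes an `engine` argument.
engine = DaskEngine(cluster=LocalCluster(threads_per_worker=1))
automl = AutoMLSearch(X_train=X, y_train=y,
                      problem_type=ProblemTypes.BINARY, engine=engine)
automl.search()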
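The solution for this error also mentions filling missing values, which helps when dropping rows would discard too much data. The following is only a minimal sketch, assuming pandas is available. The helper name `fill_missing_values` and the small sample frame are hypothetical, and the median/most-frequent strategy is one illustrative choice rather than anything EvalML prescribes.

```python
import pandas as pd


def fill_missing_values(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with NaN values filled column by column.

    Numeric columns get their median; all other columns get their most
    frequent value. Columns that are entirely NaN are left unchanged.
    """
    filled = df.copy()
    for column in filled.columns:
        series = filled[column]
        if not series.isna().any():
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(series):
            fill_value = series.median()
        else:
            modes = series.mode(dropna=True)
            fill_value = modes.iloc[0] if not modes.empty else None
        # Skip columns for which no usable fill value exists.
        if fill_value is not None and not pd.isna(fill_value):
            filled[column] = series.fillna(fill_value)
    return filled


# Small made-up frame standing in for the real dataset (hypothetical values).
example = pd.DataFrame(
    {
        "age": [63.0, None, 41.0, 56.0],
        "thal": ["normal", "fixed", None, "normal"],
        "target": [1, 0, 1, 0],
    }
)
cleaned = fill_missing_values(example)
print(cleaned.isna().sum().sum())  # total number of NaN values left
```

The cleaned frame can then be split into features and target and passed to AutoMLSearch in the same way as the frame produced by `dropna`.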
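2. Error type: ValueError: A binary classification target must contain two unique values.
Solution: Check that the target column contains exactly two distinct values.
Example code: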
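Before starting a search, the target can be inspected with a quick check along these lines. This is a sketch in plain Python, not part of EvalML's API: the helper names `count_target_classes` and `check_binary_target` are hypothetical, and missing values are ignored when counting classes.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def _is_missing(value) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def count_target_classes(values: Iterable[Hashable]) -> Counter:
    """Count how often each non-missing target value occurs."""
    return Counter(v for v in values if not _is_missing(v))


def check_binary_target(values: Iterable[Hashable]) -> None:
    """Raise ValueError unless the target has exactly two distinct values."""
    counts = count_target_classes(values)
    if len(counts) != 2:
        raise ValueError(
            f"Expected 2 unique target values, found {len(counts)}: "
            f"{sorted(counts, key=str)}"
        )


# A two-class target passes silently.
check_binary_target([0, 1, 1, 0])

# A target with more than two classes is rejected.
try:
    check_binary_target([0, 1, 2, 3, 4])
except ValueError as error:
    print(error)
```

The helpers accept any iterable, so a pandas column such as `data['target']` can be passed directly. If the check fails, the target has to be mapped to two classes, or a multiclass problem type used, before the search is run.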
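import pandas as pd
from dask.distributed import LocalCluster
from evalml.automl import AutoMLSearch
from evalml.automl.engine import DaskEngine
from evalml.problem_types import ProblemTypes
from evalml.utils import infer_feature_types
# Load the data with pandas. The URL is the dataset landing page given in the original
# answer; replace it with the location of a CSV file that has a 'target' column.
data = pd.read_csv('https://archive.ics.uci.edu/ml/datasets/heart+Disease')
# Drop the rows that contain NaN values
data = data.dropna(axis=0)
# Mark the string columns as categorical features
data = infer_feature_types(data, {'thal': 'Categorical', 'ca': 'Categorical'})
# Convert the target column to binary values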
data['target'] =