Python featuretools: A Detailed Guide to the featuretools Library (Introduction, Installation, and Usage)

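Contents

Introduction to the featuretools library

1. Three core capabilities of featuretools

2. Three advantages of featuretools

3. Why use featuretools?

3. How featuretools works

4. featuretools produces too many features → the important ones must be identified → feature reduction

Installing the featuretools library

Using the featuretools library

1. Basic examples

ML FE: Automated feature construction/derivation with featuretools on a single CSV dataset (automatically split into two dataframe tables)

ML FE: Automated feature construction/derivation with featuretools on the load_mock_customer dataset (mock customers)

ML FE: Feature derivation on a custom dataset (bank customer information, loans, and repayments), comparing hand-designed features with automated feature construction/derivation using featuretools

2. Advanced examples

ML FE: Automated feature construction/derivation with featuretools on the load_mock_customer dataset (mock customers)

ML FE: Automated feature construction/derivation with featuretools on the load_mock_customer dataset (mock customers, single DataFrame)

ML FE: An applied example of automated feature construction/derivation with Featuretools on the BigMartSales dataset (one dataframe table split into two Entity tables)

ML FE: Feature derivation on a custom dataset (bank customer information, loans, and repayments), comparing hand-designed features with automated feature construction/derivation using featuretools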


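Introduction to the featuretools library

            featuretools is a framework for automated feature engineering. It excels at transforming temporal and relational datasets into feature matrices for machine learning. In other words, it prepares data for machine learning by automatically creating features from such datasets.

Official site: Featuretools | An open source framework for automated feature engineering Quick Start
Documentation: What is Featuretools? (Featuretools 1.26.0 documentation)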

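1. Three core capabilities of featuretools

  • Deep Feature Synthesis: Featuretools uses Deep Feature Synthesis (DFS) for automated feature engineering. You can combine your raw data with what you know about that data to build meaningful features for machine learning and predictive modeling.
  • Precise handling of time: Featuretools provides APIs that ensure only valid data is used in calculations, which protects your feature vectors from common label leakage problems. Prediction times can be specified row by row.
  • Reusable feature primitives: Featuretools ships with a library of low-level functions that can be stacked to create features. You can build and share your own custom primitives for reuse on any dataset.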

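2. Three advantages of featuretools

  • It overcomes the limits of human imagination in high-dimensional space: it is a powerful method that lets us get past human limits of time and imagination and create many new features from multiple tables.
  • It integrates information across tables: it lets us merge information spread over multiple tables into a single dataframe, which we can then use to train machine learning models.
  • It assists data scientists efficiently: it constructs many new features for us to use. Although the process builds features automatically, it does not replace the data scientist, because we still need to work out how to use those features. For example, if the goal is to predict whether a customer will repay a loan, we need to find the features most strongly correlated with that outcome.
    In addition, if we have domain knowledge, we can use it to choose specific feature primitives, or to seed Deep Feature Synthesis with candidate features.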

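3. Why use featuretools?

            It improves your existing workflow: Featuretools works alongside the tools you already use to build machine learning pipelines. You can load pandas DataFrames and automatically create meaningful features in a fraction of the time it would take to do so by hand.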

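3. How featuretools works

        featuretools is built on the idea of Deep Feature Synthesis (DFS): it stacks multiple simple primitives (aggregations and transformations) to create new features. Aggregation primitives operate across the one-to-many relationships between tables and are stacked one after another, while transformation primitives are applied to one or more columns of a single table. Together they construct new features from multiple tables.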
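
To make the stacking idea concrete, here is a hand-written pandas sketch of what a depth-2 feature amounts to. It does not call featuretools; the tables and column names are hypothetical, and the feature names merely imitate the naming style that DFS uses.

```python
import pandas as pd

# Hypothetical child table: several transactions per session.
transactions = pd.DataFrame({
    "session_id": [1, 1, 2, 3, 3],
    "amount": [10.0, 20.0, 5.0, 8.0, 12.0],
    "transaction_time": pd.to_datetime(
        ["2024-01-03", "2024-01-04", "2024-02-10", "2024-02-11", "2024-03-01"]
    ),
})

# Hypothetical parent table: several sessions per customer.
sessions = pd.DataFrame({
    "session_id": [1, 2, 3],
    "customer_id": [100, 100, 200],
})

# Transformation primitive: acts on a column of a single table.
transactions["MONTH(transaction_time)"] = transactions["transaction_time"].dt.month

# Aggregation primitive (depth 1): transactions -> sessions.
session_sum = (
    transactions.groupby("session_id")["amount"]
    .sum()
    .rename("SUM(transactions.amount)")
)
sessions = sessions.join(session_sum, on="session_id")

# Stacked aggregation (depth 2): sessions -> customers.
customer_features = (
    sessions.groupby("customer_id")["SUM(transactions.amount)"]
    .mean()
    .rename("MEAN(sessions.SUM(transactions.amount))")
    .to_frame()
)
print(customer_features)
```

DFS automates this pattern: it enumerates the primitives that apply along each relationship and stacks them up to a chosen depth, instead of requiring each groupby and join to be written by hand.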

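4. featuretools produces too many features → the important ones must be identified → feature reduction

       The next step after creating all these features is to determine which of them matter. Automated feature engineering solves one problem but creates another: too many features.
       Although it is hard to say which features are important before fitting a model, certainly not all of them are relevant to the target task. Moreover, too many features can lead to poor model performance, because the less useful features interfere with the more important ones. The "curse of dimensionality" can be mitigated by feature reduction (also referred to as feature selection), the process of removing irrelevant features. Several approaches are available:

  • Principal component analysis (PCA)
  • SelectKBest
  • Using a model's feature importances
  • Auto-encoding with deep neural networks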
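
As one illustration of the approaches listed above, the following sketch applies scikit-learn's `SelectKBest` to synthetic data. The data, the seed, and the choice of `f_classif` with `k=2` are assumptions made for the example; in practice the input would be a numeric feature matrix such as the one DFS produces.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
n_samples = 200

# Synthetic binary target.
y = rng.integers(0, 2, size=n_samples)

# Two informative columns (shifted by the label) followed by eight noise columns.
informative = y[:, None] + rng.normal(scale=0.5, size=(n_samples, 2))
noise = rng.normal(size=(n_samples, 8))
X = np.hstack([informative, noise])

# Keep the k features with the highest ANOVA F-score against the target.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)
kept = selector.get_support(indices=True)

print("kept column indices:", kept)
print("reduced shape:", X_selected.shape)
```

`SelectKBest` scores each column independently, so it is cheap but blind to interactions between features; model-based importances are the usual complement when that matters.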


Installing the featuretools library

pip install featuretools
pip install -i https://mirrors.aliyun/pypi/simple featuretools

pip install -i https://mirrors.aliyun/pypi/simple featuretools==1.25.0

C:\Windows\System32>pip install -i https://mirrors.aliyun/pypi/simple featuretools==1.25.0
Looking in indexes: https://mirrors.aliyun/pypi/simple
Collecting featuretools==1.25.0
  Downloading https://mirrors.aliyun/pypi/packages/2e/13/2d13952699114634a8de19ec35277291d831720ac48f645c8cb372cbc803/featuretools-1.25.0-py3-none-any.whl (597 kB)
     ---------------------------------------- 597.9/597.9 kB 1.6 MB/s eta 0:00:00
Requirement already satisfied: cloudpickle>=1.5.0 in d:\programdata\anaconda3\lib\site-packages (from featuretools==1.25.0) (2.2.1)
Requirement already satisfied: dask>=2022.11.1 in d:\programdata\anaconda3\lib\site-packages (from dask[dataframe]>=2022.11.1->featuretools==1.25.0) (2024.2.1)
Collecting distributed>=2022.11.1 (from featuretools==1.25.0)
  Downloading https://mirrors.aliyun/pypi/packages/3d/74/6d08be57bc06ddefd6fe9cf09f322e1c1105da0ae2264145600312d72099/distributed-2024.2.1-py3-none-any.whl (1.0 MB)
     ---------------------------------------- 1.0/1.0 MB 1.4 MB/s eta 0:00:00
Requirement already satisfied: holidays>=0.13 in d:\programdata\anaconda3\lib\site-packages (from featuretools==1.25.0) (0.21.13)
Requirement already satisfied: numpy>=1.21.0 in d:\programdata\anaconda3\lib\site-packages (from featuretools==1.25.0) (1.26.4)
Requirement already satisfied: packaging>=20.0 in d:\programdata\anaconda3\lib\site-packages (from featuretools==1.25.0) (23.1)
Collecting pandas<2.0.0,>=1.5.0 (from featuretools==1.25.0)
  Downloading https://mirrors.aliyun/pypi/packages/c2/45/801ecd8434eef0b39cc02795ffae273fe3df3cfcb3f6fff215efbe92d93c/pandas-1.5.3-cp39-cp39-win_amd64.whl (10.9 MB)
     ---------------------------------------- 10.9/10.9 MB 1.4 MB/s eta 0:00:00
Requirement already satisfied: psutil>=5.6.6 in d:\programdata\anaconda3\lib\site-packages (from featuretools==1.25.0) (5.9.5)
Requirement already satisfied: scipy>=1.4.0 in d:\programdata\anaconda3\lib\site-packages (from featuretools==1.25.0) (1.11.2)
Requirement already satisfied: tqdm>=4.32.0 in d:\programdata\anaconda3\lib\site-packages (from featuretools==1.25.0) (4.66.1)
Requirement already satisfied: woodwork>=0.23.0 in d:\programdata\anaconda3\lib\site-packages (from woodwork[dask]>=0.23.0->featuretools==1.25.0) (0.29.0)
Requirement already satisfied: click>=8.1 in d:\programdata\anaconda3\lib\site-packages (from dask>=2022.11.1->dask[dataframe]>=2022.11.1->featuretools==1.25.0) (8.1.7)
Requirement already satisfied: fsspec>=2021.09.0 in d:\programdata\anaconda3\lib\site-packages (from dask>=2022.11.1->dask[dataframe]>=2022.11.1->featuretools==1.25.0) (2023.9.1)
Requirement already satisfied: partd>=1.2.0 in d:\programdata\anaconda3\lib\site-packages (from dask>=2022.11.1->dask[dataframe]>=2022.11.1->featuretools==1.25.0) (1.2.0)
Requirement already satisfied: pyyaml>=5.3.1 in d:\programdata\anaconda3\lib\site-packages (from dask>=2022.11.1->dask[dataframe]>=2022.11.1->featuretools==1.25.0) (6.0.1)
Requirement already satisfied: toolz>=0.10.0 in d:\programdata\anaconda3\lib\site-packages (from dask>=2022.11.1->dask[dataframe]>=2022.11.1->featuretools==1.25.0) (0.12.0)
Requirement already satisfied: importlib-metadata>=4.13.0 in d:\programdata\anaconda3\lib\site-packages (from dask>=2022.11.1->dask[dataframe]>=2022.11.1->featuretools==1.25.0) (7.0.1)
Requirement already satisfied: jinja2>=2.10.3 in d:\programdata\anaconda3\lib\site-packages (from distributed>=2022.11.1->featuretools==1.25.0) (3.1.2)
Collecting locket>=1.0.0 (from distributed>=2022.11.1->featuretools==1.25.0)
  Downloading https://mirrors.aliyun/pypi/packages/db/bc/83e112abc66cd466c6b83f99118035867cecd41802f8d044638aa78a106e/locket-1.0.0-py2.py3-none-any.whl (4.4 kB)
Requirement already satisfied: msgpack>=1.0.0 in d:\programdata\anaconda3\lib\site-packages (from distributed>=2022.11.1->featuretools==1.25.0) (1.0.2)
Requirement already satisfied: sortedcontainers>=2.0.5 in d:\programdata\anaconda3\lib\site-packages (from distributed>=2022.11.1->featuretools==1.25.0) (2.4.0)
Requirement already satisfied: tblib>=1.6.0 in d:\programdata\anaconda3\lib\site-packages (from distributed>=2022.11.1->featuretools==1.25.0) (1.7.0)
Requirement already satisfied: tornado>=6.0.4 in d:\programdata\anaconda3\lib\site-packages (from distributed>=2022.11.1->featuretools==1.25.0) (6.3.3)
Requirement already satisfied: urllib3>=1.24.3 in d:\programdata\anaconda3\lib\site-packages (from distributed>=2022.11.1->featuretools==1.25.0) (2.0.5)
Collecting zict>=3.0.0 (from distributed>=2022.11.1->featuretools==1.25.0)
  Downloading https://mirrors.aliyun/pypi/packages/80/ab/11a76c1e2126084fde2639514f24e6111b789b0bfa4fc6264a8975c7e1f1/zict-3.0.0-py2.py3-none-any.whl (43 kB)
     ---------------------------------------- 43.3/43.3 kB 1.1 MB/s eta 0:00:00
Requirement already satisfied: PyMeeus in d:\programdata\anaconda3\lib\site-packages (from holidays>=0.13->featuretools==1.25.0) (0.5.12)
Requirement already satisfied: convertdate>=2.3.0 in d:\programdata\anaconda3\lib\site-packages (from holidays>=0.13->featuretools==1.25.0) (2.4.0)
Requirement already satisfied: hijri-converter in d:\programdata\anaconda3\lib\site-packages (from holidays>=0.13->featuretools==1.25.0) (2.2.4)
Requirement already satisfied: korean-lunar-calendar in d:\programdata\anaconda3\lib\site-packages (from holidays>=0.13->featuretools==1.25.0) (0.3.1)
Requirement already satisfied: python-dateutil in d:\programdata\anaconda3\lib\site-packages (from holidays>=0.13->featuretools==1.25.0) (2.8.2)
Requirement already satisfied: tzdata in d:\programdata\anaconda3\lib\site-packages (from holidays>=0.13->featuretools==1.25.0) (2023.3)
Requirement already satisfied: pytz>=2020.1 in d:\programdata\anaconda3\lib\site-packages (from pandas<2.0.0,>=1.5.0->featuretools==1.25.0) (2023.3.post1)
Requirement already satisfied: colorama in d:\programdata\anaconda3\lib\site-packages (from tqdm>=4.32.0->featuretools==1.25.0) (0.4.6)
Requirement already satisfied: scikit-learn>=1.1.0 in d:\programdata\anaconda3\lib\site-packages (from woodwork>=0.23.0->woodwork[dask]>=0.23.0->featuretools==1.25.0) (1.3.0)
Requirement already satisfied: importlib-resources>=5.10.0 in d:\programdata\anaconda3\lib\site-packages (from woodwork>=0.23.0->woodwork[dask]>=0.23.0->featuretools==1.25.0) (6.1.2)
Requirement already satisfied: zipp>=0.5 in d:\programdata\anaconda3\lib\site-packages (from importlib-metadata>=4.13.0->dask>=2022.11.1->dask[dataframe]>=2022.11.1->featuretools==1.25.0) (3.7.0)
Requirement already satisfied: MarkupSafe>=2.0 in d:\programdata\anaconda3\lib\site-packages (from jinja2>=2.10.3->distributed>=2022.11.1->featuretools==1.25.0) (2.1.3)
Requirement already satisfied: six>=1.5 in d:\programdata\anaconda3\lib\site-packages (from python-dateutil->holidays>=0.13->featuretools==1.25.0) (1.16.0)
Requirement already satisfied: joblib>=1.1.1 in d:\programdata\anaconda3\lib\site-packages (from scikit-learn>=1.1.0->woodwork>=0.23.0->woodwork[dask]>=0.23.0->featuretools==1.25.0) (1.3.2)
Requirement already satisfied: threadpoolctl>=2.0.0 in d:\programdata\anaconda3\lib\site-packages (from scikit-learn>=1.1.0->woodwork>=0.23.0->woodwork[dask]>=0.23.0->featuretools==1.25.0) (3.2.0)
Installing collected packages: zict, locket, pandas, distributed, featuretools
  Attempting uninstall: zict
    Found existing installation: zict 2.0.0
    Uninstalling zict-2.0.0:
      Successfully uninstalled zict-2.0.0
  Attempting uninstall: locket
    Found existing installation: locket 0.2.1
    Uninstalling locket-0.2.1:
      Successfully uninstalled locket-0.2.1
  Attempting uninstall: pandas
    Found existing installation: pandas 2.2.1
    Uninstalling pandas-2.2.1:
      Successfully uninstalled pandas-2.2.1
  Attempting uninstall: distributed
    Found existing installation: distributed 2022.2.1
    Uninstalling distributed-2022.2.1:
      Successfully uninstalled distributed-2022.2.1
  Attempting uninstall: featuretools
    Found existing installation: featuretools 1.30.0
    Uninstalling featuretools-1.30.0:
      Successfully uninstalled featuretools-1.30.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
alibi 0.9.2 requires Pillow<10.0,>=5.4.1, but you have pillow 10.2.0 which is incompatible.
arviz 0.15.1 requires xarray>=0.21.0, but you have xarray 0.20.1 which is incompatible.
llama-index 0.8.59 requires urllib3<2, but you have urllib3 2.0.5 which is incompatible.
ludwig 0.7.4 requires jsonschema<4.7,>=4.5.0, but you have jsonschema 4.19.0 which is incompatible.
ludwig 0.7.4 requires psutil==5.9.4, but you have psutil 5.9.5 which is incompatible.
ludwig 0.7.4 requires scikit-learn<1.2.0, but you have scikit-learn 1.3.0 which is incompatible.
ludwig 0.7.4 requires transformers<4.22,>=4.10.1, but you have transformers 4.33.2 which is incompatible.
streamlit 1.24.0 requires importlib-metadata<7,>=1.4, but you have importlib-metadata 7.0.1 which is incompatible.
streamlit 1.24.0 requires pillow<10,>=6.2.0, but you have pillow 10.2.0 which is incompatible.
streamlit 1.24.0 requires protobuf<5,>=3.20, but you have protobuf 3.19.1 which is incompatible.
syft 0.8.2 requires networkx==2.8, but you have networkx 3.1 which is incompatible.
syft 0.8.2 requires numpy<=1.24.4,>=1.23.5, but you have numpy 1.26.4 which is incompatible.
syft 0.8.2 requires pydantic[email]==1.10.13, but you have pydantic 2.6.1 which is incompatible.
syft 0.8.2 requires safetensors==0.4.0, but you have safetensors 0.3.3 which is incompatible.
syft 0.8.2 requires torch[cpu]==2.1.0, but you have torch 2.0.1 which is incompatible.
syft 0.8.2 requires transformers==4.34.0, but you have transformers 4.33.2 which is incompatible.
syft 0.8.2 requires typeguard==2.13.3, but you have typeguard 4.1.5 which is incompatible.
xarray-einstats 0.5.1 requires xarray>=2022.09.0, but you have xarray 0.20.1 which is incompatible.
xport 3.6.1 requires pandas<1.4,>=1.3.5, but you have pandas 1.5.3 which is incompatible.
ydata-profiling 4.6.4 requires numpy<1.26,>=1.16.0, but you have numpy 1.26.4 which is incompatible.
Successfully installed distributed-2024.2.1 featuretools-1.25.0 locket-1.0.0 pandas-1.5.3 zict-3.0.0

[notice] A new release of pip is available: 23.3.1 -> 24.0
[notice] To update, run: python.exe -m pip install --upgrade pip
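
The log above shows that pinning featuretools==1.25.0 made pip replace pandas 2.2.1 with pandas 1.5.3 and report conflicts with other installed packages, so it may be worth confirming what ended up in the environment. The helper below is a small sketch that uses only the standard library; the name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Packages that the install log above touched or depended on.
for name in ("featuretools", "pandas", "woodwork", "distributed"):
    print(f"{name}: {installed_version(name)}")
```

Installing into a fresh virtual environment is a common way to avoid the kind of dependency conflicts listed in the log.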

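Using the featuretools library

1. Basic examples

ML FE: Automated feature construction/derivation with featuretools on a single CSV dataset (automatically split into two dataframe tables)

https://yunyaniu.blog.csdn/article/details/115448504

ML FE: Automated feature construction/derivation with featuretools on the load_mock_customer dataset (mock customers)

https://yunyaniu.blog.csdn/article/details/115364577

ML FE: Feature derivation on a custom dataset (bank customer information, loans, and repayments), comparing hand-designed features with automated feature construction/derivation using featuretools

https://yunyaniu.blog.csdn/article/details/112303581

2. Advanced examples

ML FE: Automated feature construction/derivation with featuretools on the load_mock_customer dataset (mock customers)

https://yunyaniu.blog.csdn/article/details/115364577

ML FE: Automated feature construction/derivation with featuretools on the load_mock_customer dataset (mock customers, single DataFrame)

https://yunyaniu.blog.csdn/article/details/115440698

ML FE: An applied example of automated feature construction/derivation with Featuretools on the BigMartSales dataset (one dataframe table split into two Entity tables)

https://yunyaniu.blog.csdn/article/details/115601605

ML FE: Feature derivation on a custom dataset (bank customer information, loans, and repayments), comparing hand-designed features with automated feature construction/derivation using featuretools

https://yunyaniu.blog.csdn/article/details/112303581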
