Course source: Bilibili uploader 【蚂蚁学python】
 【Course link: 【【数据可视化】Python数据图表可视化入门到实战】】
 【Course materials link: 【link】】
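Plotting a scatter plot in Python to examine the relationship between BMI and insurance charges
Scatter plots:
- Two sets of data are paired into coordinate points; the distribution of those points is examined to judge whether the two variables are related, or to summarize the pattern the points form
- The core value of a scatter plot is in revealing relationships between variables, which can then support predictive analysis and better-informed decisions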
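
To make "finding a relationship, then predicting" concrete, here is a minimal sketch in plain Python. It computes the Pearson correlation coefficient and a least-squares line for two lists of numbers. The helper names `pearson_r` and `fit_line` and the sample values are made up for illustration; they are not part of the course material or the insurance dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Hypothetical sample values, chosen only to illustrate the calculation
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r = pearson_r(xs, ys)
slope, intercept = fit_line(xs, ys)
predicted = slope * 6.0 + intercept  # extrapolate the fitted line to x = 6

print(f"r = {r:.4f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"predicted y at x = 6: {predicted:.2f}")
```

A value of `r` close to 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. The scatter plot remains useful either way, because it can reveal clusters or curved patterns that a single coefficient hides.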
 
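Example: the relationship between body mass index (BMI) and individual medical charges in the personal medical cost dataset
Original dataset URL: https://www.kaggle.com/mirichoi0218/insurance/home
1. Read the insurance dataset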
import pandas as pd
df = pd.read_csv("../DATA_POOL/PY_DATA/ant-learn-visualization-master/datas/insurance/insurance.csv")
df.head(10)
 
| | age | sex | bmi | children | smoker | region | charges |
|---|---|---|---|---|---|---|---|
| 0 | 19 | female | 27.900 | 0 | yes | southwest | 16884.92400 | 
| 1 | 18 | male | 33.770 | 1 | no | southeast | 1725.55230 | 
| 2 | 28 | male | 33.000 | 3 | no | southeast | 4449.46200 | 
| 3 | 33 | male | 22.705 | 0 | no | northwest | 21984.47061 | 
| 4 | 32 | male | 28.880 | 0 | no | northwest | 3866.85520 | 
| 5 | 31 | female | 25.740 | 0 | no | southeast | 3756.62160 | 
| 6 | 46 | female | 33.440 | 1 | no | southeast | 8240.58960 | 
| 7 | 37 | female | 27.740 | 3 | no | northwest | 7281.50560 | 
| 8 | 37 | male | 29.830 | 2 | no | northeast | 6406.41070 | 
| 9 | 60 | female | 25.840 | 0 | no | northwest | 28923.13692 | 
df.info()
 
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1338 entries, 0 to 1337
Data columns (total 7 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   age       1338 non-null   int64  
 1   sex       1338 non-null   object 
 2   bmi       1338 non-null   float64
 3   children  1338 non-null   int64  
 4   smoker    1338 non-null   object 
 5   region    1338 non-null   object 
 6   charges   1338 non-null   float64
dtypes: float64(2), int64(2), object(3)
memory usage: 73.3+ KB
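
The `df.info()` output reports 1338 non-null values in every column, so nothing is missing. As a complementary check, the sketch below counts missing values per column and summarizes the two numeric columns used in the plot. To stay self-contained it builds a small frame named `sample` (a name chosen here for illustration) from the first three rows of the `df.head(10)` output; with the real data you would call the same methods on `df`.

```python
import pandas as pd

# First three rows copied from the df.head(10) output; a stand-in for the full df
sample = pd.DataFrame(
    {
        "age": [19, 18, 28],
        "sex": ["female", "male", "male"],
        "bmi": [27.900, 33.770, 33.000],
        "children": [0, 1, 3],
        "smoker": ["yes", "no", "no"],
        "region": ["southwest", "southeast", "southeast"],
        "charges": [16884.92400, 1725.55230, 4449.46200],
    }
)

# Number of missing values in each column
missing_per_column = sample.isna().sum()

# Count, mean, std, min, quartiles, and max for the two plotted columns
numeric_summary = sample[["bmi", "charges"]].describe()

print(missing_per_column)
print(numeric_summary)
```

Looking at the minimum and maximum of `bmi` and `charges` before plotting also gives a quick sense of the axis ranges to expect in the chart.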
 
2. Plot the scatter plot with pyecharts
# Sort the data by bmi in ascending order
# inplace=True modifies df itself instead of returning a sorted copy
df.sort_values(by="bmi", inplace=True)
df.head()
 
| | age | sex | bmi | children | smoker | region | charges |
|---|---|---|---|---|---|---|---|
| 172 | 18 | male | 15.960 | 0 | no | northeast | 1694.79640 | 
| 428 | 21 | female | 16.815 | 1 | no | northeast | 3167.45585 | 
| 1226 | 38 | male | 16.815 | 2 | no | northeast | 6640.54485 | 
| 412 | 26 | female | 17.195 | 2 | yes | northeast | 14455.64405 | 
| 1286 | 28 | female | 17.290 | 0 | no | northeast | 3732.62510 | 
bmi = df["bmi"].to_list()
charges = df["charges"].to_list()
 
import pyecharts.options as opts
from pyecharts.charts import Scatter
 
scatter = (
    Scatter()
    .add_xaxis(xaxis_data=bmi)
    .add_yaxis(
        series_name="",
        y_axis=charges,
        symbol_size=4,
        label_opts=opts.LabelOpts(is_show=False),
    )
    .set_global_opts(
        xaxis_opts=opts.AxisOpts(type_="value"),
        yaxis_opts=opts.AxisOpts(type_="value"),
        title_opts=opts.TitleOpts(title="BMI vs. insurance charges", pos_left="center"),
    )
)
 
from IPython.display import HTML

# scatter.render() writes the chart to an HTML file and returns its path as a string
with open(scatter.render(), 'r', encoding='utf-8') as file:
    html_content = file.read()

# Render the HTML directly in JupyterLab
HTML(html_content)
 

