
# 07. Concatenation and Merging
[toc]{type: "ol", level: [2,3,4,5]}
### Concatenation: Series objects
```python
import pandas as pd
ser1 = pd.Series([1, 2, 3], index=list('ABC'))
ser2 = pd.Series([4, 5, 6], index=list('DEF'))
pd.concat([ser1, ser2])
```

### Concatenation: DataFrame objects
#### Data preparation
- Single-level comprehension
```python
{c: [] for c in 'AB'}
```
::: details result
{'A': [], 'B': []}
:::
- Two-level (nested) comprehension
```python
data = {c: [str(c) + str(i) for i in [1, 2]] for c in 'AB'}
data
```
::: details result
{'A': ['A1', 'A2'], 'B': ['B1', 'B2']}
:::
- Generate the data
```python
def make_df(cols, index):
    # Each cell is the column label followed by the row label, e.g. 'A1'
    data = {c: [str(c) + str(i) for i in index] for c in cols}
    return pd.DataFrame(data, index=index)

df1 = make_df('AB', [1, 2])
df2 = make_df('AB', [3, 4])
df1
df2
```


#### Vertical concatenation [default]
> The seam between the two frames runs parallel to the x-axis: rows are stacked.
```python
pd.concat([df1, df2])
```

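#### Horizontal concatenation [must be specified]
> The seam between the two frames runs parallel to the y-axis: columns are placed side by side.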
```python
pd.concat([df1, df2], axis=1)
```
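With `axis=1`, rows are aligned by index label, so frames whose indexes do not overlap produce missing values. A minimal sketch of that behavior; `df3` is a hypothetical extra frame added here only for contrast:

```python
import pandas as pd

def make_df(cols, index):
    # Each cell is the column label followed by the row label, e.g. 'A1'
    data = {c: [str(c) + str(i) for i in index] for c in cols}
    return pd.DataFrame(data, index=index)

df1 = make_df('AB', [1, 2])
df2 = make_df('AB', [3, 4])
df3 = make_df('CD', [1, 2])  # hypothetical: shares df1's index

# Disjoint indexes: the result has the union of the row labels, padded with NaN
wide = pd.concat([df1, df2], axis=1)

# Matching indexes: the columns simply sit side by side, with no NaN
aligned = pd.concat([df1, df3], axis=1)

print(wide)
print(aligned)
```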

#### Redefining the row labels
- Option 1
```python
pd.concat([df1, df2], ignore_index=True)
```

- Option 2
```python
pd.concat([df1, df2], keys=list('xy'))
```
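The two options give different kinds of row labels: `ignore_index=True` renumbers the rows from 0, while `keys` adds an outer index level that records which input each row came from. A short sketch, assuming the same `df1` and `df2` as above:

```python
import pandas as pd

def make_df(cols, index):
    # Each cell is the column label followed by the row label, e.g. 'A1'
    data = {c: [str(c) + str(i) for i in index] for c in cols}
    return pd.DataFrame(data, index=index)

df1 = make_df('AB', [1, 2])
df2 = make_df('AB', [3, 4])

# Option 1: discard the original labels and renumber 0..n-1
renumbered = pd.concat([df1, df2], ignore_index=True)

# Option 2: keep the original labels under an outer level 'x' / 'y'
keyed = pd.concat([df1, df2], keys=list('xy'))

# The outer level lets you pull one source frame back out
from_x = keyed.loc['x']
# A tuple selects a single row across both levels
cell = keyed.loc[('y', 3), 'A']

print(renumbered)
print(keyed)
```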

#### Taking the intersection
- Data preparation
```python
df1 = make_df('ABC', [1, 2])
df2 = make_df('BCD', [3, 4])
df1
df2
```


- Get the intersection
```python
pd.concat([df1, df2], join='inner')
```
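Here the intersection applies to the column labels: `join='inner'` keeps only the columns present in both frames, whereas the default `join='outer'` keeps all of them and fills the gaps with NaN. A small sketch comparing the two on the same data:

```python
import pandas as pd

def make_df(cols, index):
    # Each cell is the column label followed by the row label, e.g. 'A1'
    data = {c: [str(c) + str(i) for i in index] for c in cols}
    return pd.DataFrame(data, index=index)

df1 = make_df('ABC', [1, 2])
df2 = make_df('BCD', [3, 4])

# Only the shared columns B and C survive
inner = pd.concat([df1, df2], join='inner')

# Default behavior: union of the columns, with NaN where a frame lacks one
outer = pd.concat([df1, df2])

print(inner)
print(outer)
```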

### Merging data
#### Data preparation
```python
left = pd.DataFrame({
    'key': ['k0', 'k1', 'k2', 'k3'],
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3'],
})
right = pd.DataFrame({
    'key': ['k0', 'k1', 'k2', 'k4'],
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3'],
})
left
right
```


#### Intersection merge (inner join)
```python
# Default: how='inner'
pd.merge(left, right, how='inner')
```

#### Union merge (outer join)
```python
pd.merge(left, right, how='outer')
```
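To see which side each row of an outer merge came from, `pd.merge` accepts `indicator=True`, which adds a `_merge` column. A brief sketch on the same `left` and `right` frames:

```python
import pandas as pd

left = pd.DataFrame({
    'key': ['k0', 'k1', 'k2', 'k3'],
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3'],
})
right = pd.DataFrame({
    'key': ['k0', 'k1', 'k2', 'k4'],
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3'],
})

# _merge is 'both', 'left_only', or 'right_only' for each row
outer = pd.merge(left, right, how='outer', indicator=True)

# Map each key to its origin for a quick look
origin = dict(zip(outer['key'], outer['_merge'].astype(str)))
print(outer)
print(origin)
```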

#### Other merges
- left
> Keeps every key that exists in the left frame.
- right
> Keeps every key that exists in the right frame.
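The two modes above can be sketched as follows, reusing the same `left` and `right` frames. Keys missing from the other side get NaN in that side's columns:

```python
import pandas as pd

left = pd.DataFrame({
    'key': ['k0', 'k1', 'k2', 'k3'],
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3'],
})
right = pd.DataFrame({
    'key': ['k0', 'k1', 'k2', 'k4'],
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3'],
})

# how='left': all keys from left (k0..k3); k3 has no match, so C and D are NaN
left_merged = pd.merge(left, right, how='left')

# how='right': all keys from right (k0, k1, k2, k4); k4 has no match, so A and B are NaN
right_merged = pd.merge(left, right, how='right')

print(left_merged)
print(right_merged)
```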