Reading csv file with multiple delimiters in pandas

Introduction

This is a memorandum about reading a csv file with read_csv of Python pandas with multiple delimiters.

specifying the delimiter using sep (or delimiter) with stuffing these delimiters into “[]”

So I’ll try it right away.

Details

Suppose I have the following csv file (tempo.csv) and I want to read it as separated with some delimiters (the right side of the time has a tab).

s1,s2;s3,datetime       f1,f2,f3
a,b;c,2020/07/27 03:00  1.2,3.4,5.6
d,e;f,2021/09/28 13:03  2.3,4.5,6.7
g,h;i,2022/11/29 23:45  3.4,5.6,7.8

Here, let’s use the following seven types of delimiters to separate them.

“,” “;” “/” ” ” (space) “:” “t”(tab) “.”

How to specify the delimiter with sep (or delimiter) is just writing multiple delimiters in [] like this.

sep = “[]”

And specify engine =’python’ together.

# import pandas 
import pandas as pd

#specifying the delimiter with sep (or delimiter), put multiple delimiters into "[ ]" .
#and specify engine ='python'

df = pd.read_csv("tempo.csv", sep = "[,;/ :t.]", engine='python')
df

Done!

By the way, if you read a file without specifying anything, the default delimiter will be

“,”

Therefore

# import pandas
import pandas as pd

#default delimiter is ","
df = pd.read_csv("tempo.csv")
df

It will be like above.

And more;reorder header

You may already know by now… Reading a csv file as divided by multiple delimiters, the column header will be shifted and indexed weirdly….

So, replace the header with a list of column names according to the newly generated columns.

# import pandas
import pandas as pd

#Create a list of colmumn names in advance
cols = ["s1","s2","s3","year","month","day","h","m","f1-1","f1-2","f2-1","f2-2","f3-1","f3-2"]

#Specify the list of column names as "names" 
df = pd.read_csv("tempo.csv", sep = "[,;/ :t.]", engine='python', names = cols)

#drop original header
df.drop(df.index[0], inplace = True)
df

All done!

Reference site. Thank you.
https://stackoverflow.com/questions/26551662/import-text-to-pandas-with-multiple-delimiters
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html?highlight=delimiter%20csv

Environment

Python; 3.7.2
pandas; 1.0.5


ちょっと広告です
https://business.xserver.ne.jp/

https://www.xdomain.ne.jp/

★LOLIPOP★

.tokyo

MuuMuu Domain!