Reading a CSV file with multiple delimiters in pandas
This is a note on reading a CSV file that uses multiple delimiters with read_csv from Python pandas.
The trick is to pass the delimiters to sep (or delimiter), with all of them placed together inside "[ ]".
So I’ll try it right away.
Suppose I have the following CSV file (tempo.csv) and I want to split it on several different delimiters (the whitespace to the right of the time is a tab).
    s1,s2;s3,datetime	f1,f2,f3
    a,b;c,2020/07/27 03:00	1.2,3.4,5.6
    d,e;f,2021/09/28 13:03	2.3,4.5,6.7
    g,h;i,2022/11/29 23:45	3.4,5.6,7.8
Here, let’s use the following seven types of delimiters to separate them.
"," ";" "/" " " (space) ":" "\t" (tab) "."
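As a quick sanity check, here is a minimal sketch of how these seven characters split one row of the file. It uses Python's re.split rather than pandas, and the sample line is typed in by hand with the tab written as \t, so treat it as an illustration only.

```python
import re

# One data row from tempo.csv, typed by hand (the tab after the time is written as \t)
line = "a,b;c,2020/07/27 03:00\t1.2,3.4,5.6"

# A character class matches any single one of the seven delimiters
fields = re.split(r"[,;/ :\t.]", line)

print(fields)
print(len(fields))
```

Note that the dot also splits 1.2 into 1 and 2, so every float in the row ends up as two separate fields.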
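To specify several delimiters with sep (or delimiter), just write them all inside [ ] like this.
sep = "[,;/ :\t.]"
A sep like this is treated as a regular expression, which the default C engine does not support, so specify engine='python' together with it.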
    # import pandas
    import pandas as pd

    # Pass the delimiters to sep (or delimiter) inside "[ ]",
    # and specify engine='python'
    df = pd.read_csv("tempo.csv", sep="[,;/ :\t.]", engine="python")
    df
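If you want to try this without creating tempo.csv first, the sketch below feeds the same text to read_csv through io.StringIO. The variable name sample is mine, and I am assuming the header line also has a tab after "datetime"; adjust the string if your file differs.

```python
import io

import pandas as pd

# Assumed contents of tempo.csv; "\t" stands for the tab after the time
sample = (
    "s1,s2;s3,datetime\tf1,f2,f3\n"
    "a,b;c,2020/07/27 03:00\t1.2,3.4,5.6\n"
    "d,e;f,2021/09/28 13:03\t2.3,4.5,6.7\n"
    "g,h;i,2022/11/29 23:45\t3.4,5.6,7.8\n"
)

# Same call as for the file, but reading from an in-memory buffer
df = pd.read_csv(io.StringIO(sample), sep="[,;/ :\t.]", engine="python")

print(df.shape)          # data rows x named columns
print(df.index.nlevels)  # number of leading fields used as the index
print(list(df.columns))
```

The header row splits into 7 names while each data row splits into 14 fields, so pandas should treat the leading 7 fields of every row as the index rather than as ordinary columns.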
By the way, if you read the file without specifying sep, the default delimiter is a comma (",").
    # import pandas
    import pandas as pd

    # the default delimiter is ","
    df = pd.read_csv("tempo.csv")
    df
The file is split only at the commas, so fields such as "s2;s3" stay joined in a single column.
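To make the comma-only result concrete, here is a small sketch that reads the same text from an in-memory buffer with the default settings. As before, the sample string (including the tab in the header line) is my assumption about what tempo.csv contains.

```python
import io

import pandas as pd

# Assumed contents of tempo.csv; "\t" stands for the tab after the time
sample = (
    "s1,s2;s3,datetime\tf1,f2,f3\n"
    "a,b;c,2020/07/27 03:00\t1.2,3.4,5.6\n"
    "d,e;f,2021/09/28 13:03\t2.3,4.5,6.7\n"
    "g,h;i,2022/11/29 23:45\t3.4,5.6,7.8\n"
)

# No sep given, so only "," separates the fields
df = pd.read_csv(io.StringIO(sample))

print(df.shape)
print(list(df.columns))
```

Everything between two commas stays in one field here, including the date, the time, the tab and the first float.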
One more thing: fixing the header
You may have noticed this already. When the file is split on multiple delimiters, the data rows end up with more fields than the header row, so the column names are shifted and the extra leading fields become a strange index.
So, replace the header with a list of column names that matches the newly generated columns.
    # import pandas
    import pandas as pd

    # Create a list of column names in advance
    cols = ["s1","s2","s3","year","month","day","h","m","f1-1","f1-2","f2-1","f2-2","f3-1","f3-2"]

    # Pass the list of column names as "names"
    df = pd.read_csv("tempo.csv", sep="[,;/ :\t.]", engine="python", names=cols)

    # Drop the original header, which was read in as the first data row
    df.drop(df.index[0], inplace=True)
    df
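A possible alternative, which is not what I did above, is to skip the original header line while reading, using skiprows=1 together with names. Since the header text never enters the data, the numeric columns should then be parsed as numbers instead of being left as strings. The sketch reads from an in-memory buffer, and the sample string is again my assumption about the file contents.

```python
import io

import pandas as pd

# Assumed contents of tempo.csv; "\t" stands for the tab after the time
sample = (
    "s1,s2;s3,datetime\tf1,f2,f3\n"
    "a,b;c,2020/07/27 03:00\t1.2,3.4,5.6\n"
    "d,e;f,2021/09/28 13:03\t2.3,4.5,6.7\n"
    "g,h;i,2022/11/29 23:45\t3.4,5.6,7.8\n"
)

cols = ["s1", "s2", "s3", "year", "month", "day", "h", "m",
        "f1-1", "f1-2", "f2-1", "f2-2", "f3-1", "f3-2"]

# skiprows=1 discards the original header line; names supplies the new one
df = pd.read_csv(
    io.StringIO(sample),
    sep="[,;/ :\t.]",
    engine="python",
    names=cols,
    skiprows=1,
)

print(df.shape)
print(df["year"].tolist())
```

With this variant there is no leftover header row to drop afterwards.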