If you want to load a CSV file into a Pandas DataFrame, you can use [pandas.read_csv][readcsv].
The function takes many options. One of them is sep, the delimiter (default value is ,).
If sep is longer than one character, Pandas treats it as a regular expression, so you can customize how each line is split.
Let’s say your data looks like this:
vhigh,high,2,2,more,small
med,vhigh,3,more,big
...
You want to load that data into a Pandas DataFrame. You can split each line on the comma, but you want to ignore the comma inside decimal numbers like 2,2.
pd.read_csv("../path/to/file.csv", sep=r"(?<=\D),|,(?=\D)", engine="python", header=None)
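That’s quite a complicated regular expression. It uses a lookbehind and a lookahead assertion to match a comma only when the character before it or the character after it is not a digit.
But Pandas is able to handle it, as long as you use engine="python" for the parser engine.
The default parser is written in C. It’s faster, but it does not support regular-expression separators; without engine="python", Pandas falls back to the Python engine and emits a ParserWarning.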
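To see what the pattern does on its own, here is a minimal sketch that applies it with Python's re module to the two sample rows. The names SEP, rows and fields are only illustrative; the pattern is the one passed as sep, with the redundant backslashes before the commas dropped.

```python
import re

# A comma preceded by a non-digit, or a comma followed by a non-digit.
# Lookarounds only inspect the neighbouring character without consuming it,
# so no characters are lost from the fields.
SEP = re.compile(r"(?<=\D),|,(?=\D)")

rows = [
    "vhigh,high,2,2,more,small",
    "med,vhigh,3,more,big",
]

fields = [SEP.split(row) for row in rows]
for row in fields:
    print(row)
# ['vhigh', 'high', '2,2', 'more', 'small']
# ['med', 'vhigh', '3', 'more', 'big']
```

Note the limitation this implies: a comma between two digits is never treated as a separator, so two adjacent integer columns such as 3,4 would be merged into one field.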
You’ll get a result similar to this:
| 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| vhigh | high | 2,2 | more | small |
| med | vhigh | 3 | more | big |
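
If you want to try this without a file on disk, the following sketch feeds the two sample rows to read_csv through io.StringIO. It assumes the data has no header row, which is why header=None is passed; SAMPLE is just an illustrative stand-in for the real file.

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file (assumption: no header row).
SAMPLE = "vhigh,high,2,2,more,small\nmed,vhigh,3,more,big\n"

df = pd.read_csv(
    io.StringIO(SAMPLE),
    sep=r"(?<=\D),|,(?=\D)",  # regex separators require the Python engine
    engine="python",
    header=None,  # label the columns 0..4 instead of using the first row
)

print(df.shape)             # (2, 5)
print(df.iloc[0].tolist())  # ['vhigh', 'high', '2,2', 'more', 'small']
```

The third column still holds text such as 2,2, because the regular expression only controls where lines are split, not how values are parsed.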