RFC 4180 compliant CSV parsing and encoding for Elixir. It lets you specify other separators, so it could also be named TSV. Why it isn't, idk, probably because of the defaults.
It parses files containing rows (in UTF-8) whose fields are separated by commas or other separators.
If that's not enough reason to absolutely ❤️ it, it also parses a CSV file in order about 2x as fast as an unparallelized stream implementation, and if you don't care about the order of rows in your stream, it can deliver about 3x to 4x the speed, depending on your hardware.
CSV does not care about order by default, which makes it blazing fast while hogging your CPU. Pass num_pipes: 1 to make it process rows in the order they're given in the file and use less of your available processing power.
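As a minimal sketch of the ordered mode described above: the module and function names (OrderedRead, rows/1) are hypothetical, and the only library call used is CSV.decode with the num_pipes option named in the text.

```elixir
defmodule OrderedRead do
  # Hypothetical helper: returns a stream of rows from the file at `path`,
  # asking the decoder for a single pipe so rows keep their file order.
  def rows(path) do
    path
    |> File.stream!()
    |> CSV.decode(num_pipes: 1)
  end
end
```

Leaving num_pipes out should give the faster, unordered default, at the cost of more CPU use.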
Now.
Add {:csv, "~> 1.1.0"} to your deps in mix.exs like so:
defp deps do
  [
    {:csv, "~> 1.1.0"}
  ]
end
Do this to decode:
File.stream!("data.csv") |> CSV.decode
And you'll get a stream of rows. So, this is upcasing the text in each cell of a tab separated file because someone is angry:
File.stream!("data.csv")
|> CSV.decode(separator: ?\t)
|> Enum.map(fn row ->
  # Enum.map (not Enum.each) so the upcased cells are returned
  Enum.map(row, &String.upcase/1)
end)
Do this to encode a table (a two-dimensional list):
table_data |> CSV.encode
And you'll get a stream of lines ready to be written to an IO. So, this is writing to a file:
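# Open in write mode; File.open!/1 alone opens the file read-only
file = File.open!("test.csv", [:write, :utf8])
table_data |> CSV.encode |> Enum.each(&IO.write(file, &1))
File.close(file)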
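An alternative to opening the file by hand is to stream the lines into a file stream. The sketch below uses a hypothetical helper module, LineWriter, which is not part of the library. It accepts any enumerable of lines, so it runs without the library loaded.

```elixir
defmodule LineWriter do
  # Hypothetical helper: writes an enumerable of lines to `path`.
  # File.stream!/1 is collectable, so Stream.into/2 writes each line as the
  # stream is consumed, and Stream.run/1 forces the stream and returns :ok.
  def write_lines(lines, path) do
    lines
    |> Stream.into(File.stream!(path))
    |> Stream.run()
  end
end
```

Assuming CSV.encode yields a stream of lines as described above, table_data |> CSV.encode |> LineWriter.write_lines("test.csv") should write the encoded table without building the whole output in memory.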
Pass in another separator to the decoder:
File.stream!("data.csv") |> CSV.decode(separator: ?\t)
If you want to take revenge on whoever did this to you, encode with semicolons like this:
your_data |> CSV.encode(separator: ?;)
Make sure your data gets encoded the way you want. Implement the CSV.Encode protocol for whatever strange thing you wish to encode:
defimpl CSV.Encode, for: MyData do
  # Build a string from the struct, then delegate to the string encoder
  def encode(%MyData{has: fun}, env \\ []) do
    "so much #{fun}" |> CSV.Encode.encode(env)
  end
end
Or similar.
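To see the delegation pattern run without the library, here is a self-contained sketch. DemoEncode is a hypothetical stand-in for CSV.Encode, and its string implementation (always quoting the field and doubling embedded quotes, as RFC 4180 prescribes for quoted fields) is an assumption for illustration, not the library's actual behavior.

```elixir
# Hypothetical stand-in protocol so the sketch runs on its own.
defprotocol DemoEncode do
  def encode(data, env)
end

defimpl DemoEncode, for: BitString do
  # Assumed behavior: wrap the field in quotes and double embedded quotes.
  def encode(data, _env) do
    "\"" <> String.replace(data, "\"", "\"\"") <> "\""
  end
end

defmodule MyData do
  defstruct has: nil
end

defimpl DemoEncode, for: MyData do
  # Turn the struct into a string, then delegate to the string implementation.
  def encode(%MyData{has: fun}, env) do
    "so much #{fun}" |> DemoEncode.encode(env)
  end
end

IO.puts(DemoEncode.encode(%MyData{has: "fun"}, []))
```

The struct implementation stays small because it reuses whatever escaping the string implementation already does.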
The encoding protocol implements a fallback to Any for types where a simple call to to_string will provide unambiguous results. Protocol dispatch for the fallback to Any is very slow when protocols are not consolidated, so make sure you have consolidate_protocols: true in your mix.exs, or that you consolidate protocols manually for production, in order to get good performance.
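A sketch of where that option could sit in mix.exs. The module name, app name and version below are placeholders, not taken from this project.

```elixir
# mix.exs (sketch; MyApp, :my_app and the version are placeholders)
defmodule MyApp.Mixfile do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      # Consolidate protocols at compile time for faster dispatch
      consolidate_protocols: true,
      deps: deps()
    ]
  end

  defp deps do
    [
      {:csv, "~> 1.1.0"}
    ]
  end
end
```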
There is more to know about everything ™️ - Check the doc
MIT
Sunny
Good