By the end of this lesson, developers should be able to:
- Iterate through a file one line at a time.
- Explain why you should only use the block form of `File.open`.
- Load data using the CSV library in order to create Ruby objects.
To prepare for this lesson:
- Fork and clone this repository.
- Create a new branch, `training`, for your work.
- Check out the `training` branch.
- Install dependencies with `bundle install`.
In Ruby, files, like all IO streams, are Enumerable. You're already familiar with files and folders. Examples of IO streams other than files are `stdin`, `stdout`, and `stderr`.
Ruby's File type mixes in Enumerable via its parent class, IO. Therefore, we can use all of the Enumerable methods to process files. To Ruby, at least, a file is just a list that we can process in chunks, either a character or a line at a time. By default, Ruby processes files one line at a time.
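As a minimal sketch of what "files are Enumerable" means in practice, the snippet below calls `map` and `count` directly on an open file. The method names `line_lengths` and `vowel_count` and the sample text are made up for illustration; a temporary file stands in for real data.

```ruby
require 'tempfile'

# Length of each line, using Enumerable#map directly on the File object.
# Iterating a file yields one line at a time by default.
def line_lengths(path)
  File.open(path) { |file| file.map { |line| line.chomp.length } }
end

# Count vowels by walking the same kind of file one character at a time.
def vowel_count(path)
  File.open(path) do |file|
    file.each_char.count { |char| 'aeiou'.include?(char) }
  end
end

# Usage with a throwaway file (hypothetical sample data).
Tempfile.create('enumerable_demo') do |tmp|
  tmp.write("apple\nbanana\ncherry\n")
  tmp.flush
  p File.include?(Enumerable) # => true
  p line_lengths(tmp.path)    # => [5, 6, 6]
  p vowel_count(tmp.path)     # => 6
end
```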
Other enumerable classes related to working with files include IO (mentioned above) and Dir. Dir is Ruby's abstraction for working with directory structures.
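Because Dir is Enumerable too, the same methods work on directory entries. The sketch below is one possible illustration; the helper name `csv_files` and the file names are hypothetical, and a temporary directory is used so the example does not touch the repository.

```ruby
require 'tmpdir'
require 'fileutils'

# Names of the entries in `path` that end in .csv, sorted.
# Dir.open with a block closes the directory handle when the block exits.
def csv_files(path)
  Dir.open(path) do |dir|
    dir.select { |name| name.end_with?('.csv') }.sort
  end
end

# Usage with a throwaway directory (hypothetical file names).
Dir.mktmpdir do |dir|
  %w[people.csv pets.csv notes.txt].each do |name|
    FileUtils.touch(File.join(dir, name))
  end
  p csv_files(dir) # => ["people.csv", "pets.csv"]
end
```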
Using `bin/read_file.rb`, we'll read all the lines in a file and print them.
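The script itself is written during the lesson, so the following is only a sketch of one way it might look, not the contents of `bin/read_file.rb`. The method name `print_lines` and its `out:` parameter are assumptions. It also shows why the block form of `File.open` is preferred: the file is closed automatically when the block exits, even if an exception is raised inside it.

```ruby
require 'tempfile'

# Print every line of the file at `path`, prefixed with its line number.
# The block form of File.open closes the file for us when the block ends.
def print_lines(path, out: $stdout)
  File.open(path) do |file|
    file.each_line.with_index(1) do |line, number|
      out.puts "#{number}: #{line.chomp}"
    end
  end
end

# Usage with a throwaway file (hypothetical sample data).
Tempfile.create('read_file_demo') do |tmp|
  tmp.write("first\nsecond\n")
  tmp.flush
  print_lines(tmp.path)
end
```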
Let's create a script in `bin/word_count.rb` to mimic the behavior of the `wc` (word count) command line utility.
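A rough sketch of the idea follows; it is not the lesson's `bin/word_count.rb`. The method name `word_count` is an assumption, and the counts only approximate `wc` (for example, words are split on whitespace and lines are counted as the file yields them).

```ruby
require 'tempfile'

# Count lines, words, and bytes in a single pass over the file.
def word_count(path)
  counts = { lines: 0, words: 0, bytes: 0 }
  File.open(path) do |file|
    file.each_line do |line|
      counts[:lines] += 1
      counts[:words] += line.split.length
      counts[:bytes] += line.bytesize
    end
  end
  counts
end

# Usage with a throwaway file (hypothetical sample data).
Tempfile.create('wc_demo') do |tmp|
  tmp.write("the quick brown fox\njumps over\n")
  tmp.flush
  counts = word_count(tmp.path)
  puts counts.values_at(:lines, :words, :bytes).join(' ') # prints "2 6 31"
end
```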
A file containing Comma Separated Values (CSV) is a simple and well supported format for data interchange, especially for tabular data. It is an open, plain-text format in which values are separated by commas. It may or may not have a "header" as the first line, naming the "columns" of data. Each record is represented as a "row".
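To make the header, column, and row vocabulary concrete, here is a small sketch that parses CSV text with Ruby's CSV class. The sample data and the helper name `parse_rows` are invented for illustration and are not taken from the lesson's data files.

```ruby
require 'csv'

# Parse CSV text whose first line is a header.
# Each row becomes a Hash keyed by the column names.
def parse_rows(text)
  CSV.parse(text, headers: true).map(&:to_h)
end

# Hypothetical sample data: one header line, then two rows.
sample = <<~TEXT
  name,species,age
  Fluffy,cat,3
  Rex,dog,5
TEXT

parse_rows(sample).each do |row|
  puts "#{row['name']} is a #{row['species']}"
end
```

Note that every value comes back as a String unless a converter is supplied.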
Tabular data, columns, and rows? We're dealing with a spreadsheet (without formulas). There is an example in `data/people.csv`.
- Watch as I open the file in my text editor.
- Watch as I open the file in a spreadsheet program.
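We previously used the Ruby class CSV to load data for us. The CSV class is part of the standard library, which means that we can use it after `require 'csv'` without downloading a separate gem. (Since Ruby 3.4, csv is a bundled gem rather than a default gem, so projects that use Bundler also need to list it in their Gemfile.)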
We used CSV in the `bin/people_array.rb` script in Ruby Array Methods.
- Watch as I run `bin/people_array.rb`.
- Note how I inspect the files associated with the script, specifically looking for how the `CSV` class is used.
We'll build a data loader for pets in `lib/pets.rb` using CSV.
We'll use a `lambda` to ensure we use properly formatted symbols as keys when loading data. We'll use the shorthand syntax, sometimes called the "stabby" lambda: `->([args]) {[code]}`.
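The loader is built during the lesson, so the following is only a sketch under stated assumptions: the `Pet` struct, the `TO_SYMBOL` lambda, the `load_pets` method, and the column names are all hypothetical stand-ins rather than the contents of `lib/pets.rb`. It shows a stabby lambda passed as `header_converters` so that each row's keys are lowercase symbols.

```ruby
require 'csv'

# Hypothetical stand-in for the lesson's pet class.
Pet = Struct.new(:name, :kind, :age, keyword_init: true)

# Stabby lambda that turns a header such as "Name" into the symbol :name.
TO_SYMBOL = ->(header) { header.strip.downcase.to_sym }

# Build one Pet per CSV row; row.to_h has symbol keys thanks to the lambda.
def load_pets(text)
  CSV.parse(text, headers: true, header_converters: TO_SYMBOL).map do |row|
    Pet.new(**row.to_h)
  end
end

# Usage with inline sample data (hypothetical).
pets = load_pets("Name,Kind,Age\nFluffy,cat,3\nRex,dog,5\n")
pets.each { |pet| puts "#{pet.name} (#{pet.kind}), age #{pet.age}" }
```

The values stay Strings here (`"3"`, not `3`); converting them is a separate step.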
A `lambda` is the closest thing Ruby has to an anonymous function. Unlike plain Procs, lambdas check arity (the number of arguments they receive), and a `return` inside a lambda exits only the lambda rather than the enclosing method (Proc docs).
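The two differences between lambdas and plain Procs can be seen in a short sketch. The method names `with_lambda` and `with_proc` are made up for this illustration.

```ruby
# A lambda's `return` only leaves the lambda; the method keeps running.
def with_lambda
  double = ->(n) { return n * 2 }
  result = double.call(2)
  "lambda returned #{result}, method continued"
end

# A proc's `return` returns from the method that defined the proc.
def with_proc
  double = proc { |n| return n * 2 }
  double.call(2)
  'never reached'
end

puts with_lambda # prints "lambda returned 4, method continued"
puts with_proc   # prints "4"

# Arity: lambdas are strict, procs are lenient.
strict = ->(a, b) { [a, b] }
lenient = proc { |a, b| [a, b] }

p lenient.call(1) # => [1, nil]
begin
  strict.call(1)
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```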
Instead of using a lambda to convert our headers, we could pass a symbol from HeaderConverters as the value for `:header_converters` in the options Hash.
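As a sketch of that alternative, the built-in `:symbol` header converter can be named directly. The helper name `parse_with_symbol_headers` and the sample data are assumptions made for illustration.

```ruby
require 'csv'

# Use the built-in :symbol header converter instead of a custom lambda.
def parse_with_symbol_headers(text)
  CSV.parse(text, headers: true, header_converters: :symbol).map(&:to_h)
end

# Usage with inline sample data (hypothetical).
rows = parse_with_symbol_headers("First Name,Age\nAda,36\n")
p rows.first.keys # => [:first_name, :age]
```

The built-in converter also replaces spaces with underscores, which a simple `downcase.to_sym` lambda would not do.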
Read two files at the same time using `bin/read_files.rb`.
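One possible approach, offered as a sketch and not as the contents of `bin/read_files.rb`, is to open both files with nested blocks and pair their lines with `zip`. The method name `paired_lines` and the sample text are hypothetical.

```ruby
require 'tempfile'

# Pair up the lines of two files: line 1 with line 1, and so on.
# If the second file is shorter, its missing lines come back as nil.
def paired_lines(path_a, path_b)
  File.open(path_a) do |file_a|
    File.open(path_b) do |file_b|
      file_a.each_line.zip(file_b.each_line).map do |a, b|
        [a.chomp, b&.chomp]
      end
    end
  end
end

# Usage with two throwaway files (hypothetical sample data).
Tempfile.create('left') do |left|
  Tempfile.create('right') do |right|
    left.write("a1\na2\n")
    right.write("b1\nb2\n")
    [left, right].each(&:flush)
    paired_lines(left.path, right.path).each do |a, b|
      puts "#{a} | #{b}"
    end
  end
end
```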
Look at Enumerator, which is what gets returned when we call `each` on an open file without a block.
We'll need to look briefly at exception handling as Enumerator relies on this mechanism.
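The sketch below shows the connection between Enumerator and exception handling: `next` raises `StopIteration` once the lines run out, and `Kernel#loop` rescues that exception for us. A `StringIO` stands in for an open file, and the helper name `drain` is made up for illustration.

```ruby
require 'stringio'

# Collect everything an enumerator has left.
# Kernel#loop rescues StopIteration, so the loop ends quietly.
def drain(enumerator)
  items = []
  loop { items << enumerator.next }
  items
end

# StringIO stands in for an open file here; it offers the same each_line.
io = StringIO.new("one\ntwo\n")

lines = io.each_line # no block given, so we get an Enumerator back
p lines.class        # => Enumerator

puts lines.next      # prints "one"
puts lines.next      # prints "two"

# Once the lines run out, `next` raises StopIteration.
begin
  lines.next
rescue StopIteration
  puts 'no more lines'
end

# Start over and let loop handle the exception instead.
io.rewind
p drain(io.each_line) # => ["one\n", "two\n"]
```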
- Annotate `lib/people.rb` if you haven't already.
- Annotate `lib/person.rb`. Start with annotating what you can easily annotate. Then, methodically and actively reading the difficult portions, annotate each remaining line.
- Annotate `lib/pets.rb` if you haven't already.
- Annotate `lib/pets.rb`. Start with annotating what you can easily annotate. Then, methodically and actively reading the difficult portions, annotate each remaining line.
- Compare and contrast the person files as a group with the pet files as a group.
Developers should run these often!
- `bin/rake nag` (or `bundle exec rake nag`): runs code quality analysis tools on your code and complains.
- `bin/rake test` (or `bundle exec rake test`): runs automated tests.
- `bin/rake` will run both `nag` and `test`.
- All content is licensed under a CC BY-NC-SA 4.0 license.
- All software code is licensed under GNU GPLv3. For commercial use or alternative licensing, please contact [email protected].