Task
Create an R object that contains the data from a comma-separated file (which probably has the file extension “csv”). We assume the data are rectangular, that is, that we can think of them as being in rows and columns.
Preparation
None, other than starting R.
Doing it
superbowl <- read.table( "http://www.portfolioprobe.com/R/blog/dowjones_super.csv", sep=",", header=TRUE)
This command should work for you if you copy and paste it into an R session where you have access to the internet.
Explanation
Our call to the read.table function has used three arguments:
- the location of the file — typically this would be a file on your file system.
- the sep argument says what character is used to separate items.
- the header argument says whether or not the first line in the file contains labels.
The result is a data frame. This is a rectangular object that can have different types of items in each column.
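To try the three arguments without relying on an internet connection, here is a minimal sketch that writes a tiny comma-separated file to a temporary location and reads it back. The file contents and the names tmp, toy and toy2 are made up for illustration. stringsAsFactors=FALSE is set explicitly so that the text column is character in every version of R.

```r
# Write a small, made-up comma-separated file to a temporary location
tmp <- tempfile(fileext=".csv")
writeLines(c("Winner,Return,Correct",
             "National,0.15,TRUE",
             "American,-0.15,FALSE"), tmp)

# The same three arguments: location, separator, header
toy <- read.table(tmp, sep=",", header=TRUE, stringsAsFactors=FALSE)

# Each column of the data frame has its own type
sapply(toy, class)

# read.csv is a wrapper whose defaults are sep="," and header=TRUE
toy2 <- read.csv(tmp, stringsAsFactors=FALSE)
all.equal(toy, toy2)
```

For a file like this one, read.csv saves a little typing and should give the same data frame.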
We have used the <- operator to make an assignment to the name “superbowl”. We could have used the = operator also — there is no difference in this case (and most cases). You won’t go wrong with object names if they start with a letter and include only letters, digits, dots (.) and underscores (_). Names are case-sensitive: “superbowl” is not the same as “superBowl”.
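A quick sketch of the two points about assignment and names. The object names here are arbitrary and exist only for this illustration.

```r
# The two assignment operators give the same result at the top level
a <- c(1, 2, 3)
b = c(1, 2, 3)
identical(a, b)

# Names are case-sensitive: these are two different objects
superbowl_demo <- "lower case b"
superBowl_demo <- "upper case B"
identical(superbowl_demo, superBowl_demo)
```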
We can check to see if the object looks like what we expect:
> dim(superbowl)
[1] 45  4
> head(superbowl)
       Winner DowJonesSimpleReturn DowJonesUpDown DowJonesCorrect
1967 National           0.15199379             Up         correct
1968 National           0.04269094             Up         correct
1969 American          -0.15193642           Down         correct
1970 American           0.04817832             Up           wrong
1971 American           0.06112621             Up           wrong
1972 National           0.14583240             Up         correct
So the object we get has 45 rows and 4 columns. The first few rows are shown with head. The tail function shows the last few rows.
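The same checks can be tried on any data frame. The sketch below uses a small made-up table (the values are not the real Dow Jones figures) that is read from a text string rather than a file, so it runs without any download.

```r
# A small, made-up data frame standing in for the result of read.table
toyReturns <- read.table(text="Year,Return
1967,0.15
1968,0.04
1969,-0.15
1970,0.05
1971,0.06
1972,0.15
1973,-0.17
1974,-0.28", sep=",", header=TRUE)

dim(toyReturns)        # number of rows, then number of columns
head(toyReturns, 3)    # first 3 rows (the default is 6)
tail(toyReturns, 3)    # last 3 rows
str(toyReturns)        # compact summary of the type of each column
```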
Further details
coerce to matrix
If all of the data in the file are numeric (except possibly row and column labels), then you may want to coerce the result into a matrix. You would do that with something like:
> myMatrix <- as.matrix(read.table(filename, sep=",",
+    header=TRUE))
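As a self-contained sketch of the coercion, the code below writes a made-up numeric file whose header line has one fewer field than the data lines, so the first field of each line becomes a row label. The names tmpNum and numMatrix are hypothetical.

```r
# A made-up numeric file: 3 column labels, then a row label plus 3 numbers per line
tmpNum <- tempfile(fileext=".csv")
writeLines(c("A,B,C",
             "r1,1,2,3",
             "r2,4,5,6"), tmpNum)

# Read the file as a data frame, then coerce it to a matrix
numMatrix <- as.matrix(read.table(tmpNum, sep=",", header=TRUE))

is.matrix(numMatrix)     # it is now a matrix rather than a data frame
is.numeric(numMatrix)    # numeric because every column was numeric
numMatrix["r2", "B"]     # the row and column labels are kept
```

If any column is not numeric, as.matrix produces a character matrix instead, so it is worth checking is.numeric on the result.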
no column names
If there are no column labels, then you would use header=FALSE or say nothing since FALSE is the default value.
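A short sketch of reading data without column labels, using made-up values supplied as a text string. read.table then names the columns V1, V2 and so on. The col.names argument is one way to supply your own labels.

```r
# Made-up data with no header line
rawText <- "1967,0.15\n1968,0.04"

# Without labels in the file, the columns get default names
noHeader <- read.table(text=rawText, sep=",", header=FALSE)
names(noHeader)

# Supply the labels yourself with col.names
withNames <- read.table(text=rawText, sep=",", header=FALSE,
    col.names=c("Year", "Return"))
names(withNames)
```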
row names
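The year was automatically selected as the row names in the example because the first line of the file has one fewer field than the other lines. In this case, it is a pretty good choice, but you can control which data, if any, are selected to be the row names with the row.names argument.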
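The row.names argument of read.table is how that control is exercised. The sketch below uses made-up data in which the header line labels every column, including the year, so nothing is picked as row names unless we ask for it.

```r
# Made-up data where the header labels all three columns
txt <- "Year,Winner,Return
1967,National,0.15
1968,National,0.04
1969,American,-0.15"

# Default: rows are simply numbered 1, 2, 3 and Year stays a column
autoRows <- read.table(text=txt, sep=",", header=TRUE)
rownames(autoRows)

# Use the first column as the row names
yearRows <- read.table(text=txt, sep=",", header=TRUE, row.names=1)
rownames(yearRows)

# Or pick the column by its label
byLabel <- read.table(text=txt, sep=",", header=TRUE, row.names="Year")
all.equal(yearRows, byLabel)
```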
unknown file name
If you are on Windows and you aren’t exactly sure of the file name you want (or it is too much bother to type it), then you can use the file.choose function:
myObj <- read.table(file.choose(), sep=",", header=TRUE)
This pops up a window in which you can choose the file that you want.
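Because file.choose needs someone at the keyboard, it can be handy to fall back on it only when no file name is given. The helper below is a hypothetical sketch, not part of R. The name readChosenCsv and its path argument are made up for this example.

```r
# Hypothetical helper: ask for the file only if no path is supplied
readChosenCsv <- function(path=NULL) {
    if (is.null(path)) {
        if (!interactive()) {
            stop("no path given and no interactive session to choose a file")
        }
        path <- file.choose()
    }
    read.table(path, sep=",", header=TRUE)
}

# Usage with an explicit path to a temporary, made-up file
tmpChoose <- tempfile(fileext=".csv")
writeLines(c("x,y", "1,2", "3,4"), tmpChoose)
chosen <- readChosenCsv(tmpChoose)
chosen
```

In an interactive session, calling readChosenCsv() with no argument would open the file chooser.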
help
You can get the help for read.table with the command:
?read.table
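If you only want a reminder of the arguments and their default values, a lighter alternative to the full help page is to inspect the function itself. This sketch uses only functions from base R.

```r
# Print the arguments of read.table along with their defaults
args(read.table)

# Pull out individual defaults
formals(read.table)$header    # default for header
formals(read.table)$sep       # default separator is "", meaning white space

# read.csv has defaults suited to comma-separated files
formals(read.csv)$sep
formals(read.csv)$header
```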
