1.2.3-SNAPSHOT

get-dataset

incanter.datasets

  • (get-dataset dataset-key & options)
Returns the sample dataset associated with the given key. Most datasets
are from R's sample data sets, as are the descriptions below.

Options:

:incanter-home -- if the incanter.home property was not set when the JVM was
started (using -Dincanter.home) and there is no INCANTER_HOME
environment variable, use the :incanter-home option to
provide the parent directory of the sample data directory.

:from-repo (default false) -- If true, retrieves the dataset from the online repository
instead of from the local file system. This also happens by default if incanter-home is not set.
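
The lookup order implied by these options can be sketched in plain Clojure. This is only an illustration: the function name `dataset-location`, the `env` map that stands in for the JVM property and the environment variable, the placeholder base URL, and the file name `data/cars.csv` are all assumptions made for the example and are not part of Incanter.

```clojure
;; Hypothetical helper, not part of Incanter: mirrors how the options
;; decide where a dataset file is read from.
(defn dataset-location
  "Returns the path or URL a dataset file would be read from.
  `env` is a map standing in for the JVM property and environment variable."
  [relative-file {:keys [incanter-home from-repo]} env]
  (let [base-url "https://example.invalid/incanter/" ; placeholder, not the real repository URL
        home (or incanter-home      ; explicit option wins
                 (:property env)    ; stands in for -Dincanter.home
                 (:env-var env))]   ; stands in for INCANTER_HOME
    (if (or (nil? home) (true? from-repo))
      (str base-url relative-file)
      (str home "/" relative-file))))

;; An explicit :incanter-home is used as the parent directory.
(dataset-location "data/cars.csv" {:incanter-home "/opt/incanter"} {})
;; => "/opt/incanter/data/cars.csv"

;; With no home directory known, the online repository is used.
(dataset-location "data/cars.csv" {} {})
;; => "https://example.invalid/incanter/data/cars.csv"
```

Passing `:from-repo true` selects the repository URL even when a home directory is available.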


Datasets:

:iris -- Fisher's (or Anderson's) iris data set gives the
measurements in centimeters of the variables sepal
length and width and petal length and width,
respectively, for 50 flowers from each of 3 species
of iris.

:cars -- The data give the speed of cars and the distances taken
to stop. Note that the data were recorded in the 1920s.

:survey -- survey data used in Scott Lynch's 'Introduction to Applied Bayesian Statistics
and Estimation for Social Scientists'

:us-arrests -- This data set contains statistics, in arrests per 100,000
residents for assault, murder, and rape in each of the 50 US
states in 1973. Also given is the percent of the population living
in urban areas.

:flow-meter -- flow meter data used in the Bland and Altman Lancet paper.

:co2 -- has 84 rows and 5 columns of data from an experiment on the cold tolerance
of the grass species _Echinochloa crus-galli_.

:chick-weight -- has 578 rows and 4 columns from an experiment on the effect of diet
on early growth of chicks.

:plant-growth -- Results from an experiment to compare yields (as measured by dried
weight of plants) obtained under a control and two different
treatment conditions.

:pontius -- These data are from a NIST study involving calibration of load cells.
The response variable (y) is the deflection and the predictor variable
(x) is load.
See http://www.itl.nist.gov/div898/strd/lls/data/Pontius.shtml

:filip -- NIST data set for linear regression certification,
see http://www.itl.nist.gov/div898/strd/lls/data/Filip.shtml

:longely -- This classic dataset of labor statistics was one of the first used to
test the accuracy of least squares computations. The response variable
(y) is the Total Derived Employment and the predictor variables are GNP
Implicit Price Deflator with Year 1954 = 100 (x1), Gross National Product
(x2), Unemployment (x3), Size of Armed Forces (x4), Non-Institutional
Population Age 14 & Over (x5), and Year (x6).
See http://www.itl.nist.gov/div898/strd/lls/data/Longley.shtml

:Chwirut -- These data are the result of a NIST study involving ultrasonic calibration.
The response variable is ultrasonic response, and the predictor variable is
metal distance.
See http://www.itl.nist.gov/div898/strd/nls/data/LINKS/DATA/Chwirut1.dat

:thurstone -- test data for non-linear least squares.

:austres -- Quarterly Time Series of the Number of Australian Residents

:hair-eye-color -- Hair and eye color of a sample of students

:airline-passengers -- Monthly Airline Passenger Numbers 1949-1960

:math-prog -- Pass/fail results for a high school mathematics assessment test
and a freshman college programming course.

:iran-election -- Vote counts for 30 provinces from the 2009 Iranian election.

Examples:
(def data (get-dataset :cars))
(def data2 (get-dataset :cars :incanter-home "/usr/local/packages/incanter"))



Source incanter/datasets.clj:91

(defn get-dataset
" Returns the sample dataset associated with the given key. Most datasets
  are from R's sample data sets, as are the descriptions below.

  Options:

    :incanter-home -- if the incanter.home property is not set when the JVM is
                      started (using -Dincanter.home) or there is no INCANTER_HOME
                      environment variable set, use the :incanter-home options to 
                      provide the parent directory of the sample data directory.

    :from-repo (default false) -- If true, retrieves the dataset from the online repository 
                       instead of locally, it will do this by default if incanter-home is not set.


  Datasets:

    :iris -- the Fisher's or Anderson's Iris data set gives the
             measurements in centimeters of the variables sepal
             length and width and petal length and width,
             respectively, for 50 flowers from each of 3 species
             of iris.

    :cars -- The data give the speed of cars and the distances taken
              to stop. Note that the data were recorded in the 1920s.

    :survey -- survey data used in Scott Lynch's 'Introduction to Applied Bayesian Statistics
               and Estimation for Social Scientists'

    :us-arrests -- This data set contains statistics, in arrests per 100,000
                   residents for assault, murder, and rape in each of the 50 US
                   states in 1973. Also given is the percent of the population living
                   in urban areas.

    :flow-meter -- flow meter data used in Bland Altman Lancet paper.

    :co2 -- has 84 rows and 5 columns of data from an experiment on the cold tolerance
            of the grass species _Echinochloa crus-galli_.

    :chick-weight -- has 578 rows and 4 columns from an experiment on the effect of diet
                     on early growth of chicks.

    :plant-growth -- Results from an experiment to compare yields (as measured by dried
                     weight of plants) obtained under a control and two different
                     treatment conditions.

    :pontius -- These data are from a NIST study involving calibration of load cells.
                The response variable (y) is the deflection and the predictor variable
                (x) is load.
                See http://www.itl.nist.gov/div898/strd/lls/data/Pontius.shtml

    :filip -- NIST data set for linear regression certification,
              see http://www.itl.nist.gov/div898/strd/lls/data/Filip.shtml

    :longely -- This classic dataset of labor statistics was one of the first used to
                test the accuracy of least squares computations. The response variable
                (y) is the Total Derived Employment and the predictor variables are GNP
                Implicit Price Deflator with Year 1954 = 100 (x1), Gross National Product
                (x2), Unemployment (x3), Size of Armed Forces (x4), Non-Institutional
                Population Age 14 & Over (x5), and Year (x6).
                See http://www.itl.nist.gov/div898/strd/lls/data/Longley.shtml

    :Chwirut -- These data are the result of a NIST study involving ultrasonic calibration.
                The response variable is ultrasonic response, and the predictor variable is
                metal distance.
                See http://www.itl.nist.gov/div898/strd/nls/data/LINKS/DATA/Chwirut1.dat

    :thurstone -- test data for non-linear least squares.

    :austres -- Quarterly Time Series of the Number of Australian Residents

    :hair-eye-color -- Hair and eye color of sample of students

    :airline-passengers -- Monthly Airline Passenger Numbers 1949-1960

    :math-prog -- Pass/fail results for a high school mathematics assessment test
                  and a freshmen college programming course.

    :iran-election -- Vote counts for 30 provinces from the 2009 Iranian election.

   Examples:
     (def data (get-dataset :cars))
     (def data2 (get-dataset :cars :incanter-home \"/usr/local/packages/incanter\"))


"
  ([dataset-key & options]
    (let [opts (when options (apply assoc {} options))
          incanter-home (or (:incanter-home opts)
                            (System/getProperty "incanter.home")
                            (System/getenv "INCANTER_HOME"))
          ds (**datasets** dataset-key)
          ;; Only an explicit `true` enables the repository download.
          from-repo? (true? (:from-repo opts))
          ;; Fall back to the online repository when no home directory is known.
          filename (if (or (nil? incanter-home) from-repo?)
                     (str **datasets-base-url** (ds :filename))
                     (str incanter-home "/" (ds :filename)))
          delim (ds :delim)
          header (ds :header)]
      (read-dataset filename :delim delim :header header))))
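
The way the trailing `& options` arguments become an option map can be tried in isolation. The sketch below uses a hypothetical helper named `parse-options` (not part of Incanter) that repeats the `(apply assoc {} options)` idiom from the source, and shows why an option key that is misspelled, for example `:incanter.home` instead of `:incanter-home`, is silently ignored.

```clojure
;; Hypothetical helper illustrating the `& options` idiom; not part of Incanter.
(defn parse-options
  "Turns alternating key/value arguments into a map, or nil when none are given."
  [& options]
  (when options (apply assoc {} options)))

(parse-options :from-repo true)
;; => {:from-repo true}

(parse-options)
;; => nil

;; A misspelled key is stored under its own name, so a lookup of the
;; expected key finds nothing.
(:incanter-home (parse-options :incanter.home "/tmp"))
;; => nil

;; `true?` only accepts the boolean true, so a string does not enable the flag.
(true? (:from-repo (parse-options :from-repo "true")))
;; => false
```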
