Data formatters

data_process.formatForD3(pandas_json)

Re-format JSON regression results to be more friendly to D3.js, the plotting library used by the web front-end.

JSON output from pandas is ‘dict-like’: regression coefficients and their values are stored as key-value pairs in a dictionary.

There is no particularly convenient way to handle this type of data in D3.js, which works best with an array of two-element dictionaries in which one key holds the element's name and the other holds its value.

As an example, this function would take the pandas-formatted JSON object

{0: {Intercept: 20.8178307102, rm: 4.2452889618, lstat: -0.5213794328}}

and return

[{Name: "rm", Val: 4.2452889618}, {Name: "Intercept", Val: 20.8178307102}, {Name: "lstat", Val: -0.5213794328}]

Args:
pandas_json: JSON object output from pandas.DataFrame.to_json(orient='columns')
Returns:
Reformatted JSON object.
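
The transformation described above can be sketched in a few lines of Python. This is only an illustration, not the project's implementation: the name format_for_d3_sketch is hypothetical, and the sketch assumes the input holds a single column and that returning a Python list (rather than a serialized string) is acceptable. Elements are emitted in the order they appear in the input.

```python
import json

def format_for_d3_sketch(pandas_json):
    """Hypothetical stand-in for formatForD3; not the project's actual code."""
    # Accept either the JSON string produced by to_json() or an already parsed dict.
    data = json.loads(pandas_json) if isinstance(pandas_json, str) else pandas_json
    # Assumption: the frame has a single column, so take the first (only) entry.
    coefficients = next(iter(data.values()))
    # Build one {"Name": ..., "Val": ...} dictionary per coefficient, in input order.
    return [{"Name": name, "Val": value} for name, value in coefficients.items()]

raw = '{"0": {"Intercept": 20.8178307102, "rm": 4.2452889618, "lstat": -0.5213794328}}'
formatted = format_for_d3_sketch(raw)
print(json.dumps(formatted))
```

Serializing the list with json.dumps gives a string that a front-end could parse directly into the array-of-dictionaries shape described above.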

Base Flask App

hdim_app.allowed_file(filename)

Determine whether a given file has an allowable file type.

Args:
filename: Name of the file (data.csv, for example).
Returns:
False if filename has an extension not listed in the app configuration variable “ALLOWED_EXTENSIONS”, True otherwise.
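
A check of this kind might look like the sketch below. The function name allowed_file_sketch and the extension values are hypothetical, and a plain set stands in for the app configuration. The sketch also assumes that a file name with no extension is rejected and that matching is case-insensitive; the description above does not settle either point.

```python
# Hypothetical configuration value; the real list lives in the app config.
ALLOWED_EXTENSIONS = {"csv", "txt"}

def allowed_file_sketch(filename, allowed=ALLOWED_EXTENSIONS):
    """Illustrative extension check; not the project's actual code."""
    # Assumption: a name without any extension is rejected.
    if "." not in filename:
        return False
    # Compare the text after the last dot, ignoring case.
    extension = filename.rsplit(".", 1)[1].lower()
    return extension in allowed

print(allowed_file_sketch("data.csv"))
print(allowed_file_sketch("data.exe"))
```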
hdim_app.index()

Display the root HTML file.

hdim_app.regress()

Perform an L1-regularized (i.e. LASSO) regression on a data set.

Args:

Data will be passed from a client (website, cURL, Excel app, etc.) via an HTTP POST request.

The POST request form-data should specify the design matrix, the vector of predictors and any header information.

The request MUST also contain the following fields in addition to the form data:
regression_type: specifies the regression method to use
data_type: specifies what format the data is in
regression_index: the index of the column containing the vector of predictors
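
One way to check for the required fields before running a regression is sketched below. The helper name missing_fields and the example values "lasso" and "csv" are hypothetical placeholders; the accepted values for each field are not specified here. The helper works on any mapping that provides a get method, such as a plain dict.

```python
# Field names taken from the list above.
REQUIRED_FIELDS = ("regression_type", "data_type", "regression_index")

def missing_fields(form):
    """Return the required field names that are absent or blank in the form."""
    return [name for name in REQUIRED_FIELDS
            if not str(form.get(name, "")).strip()]

# Placeholder values for illustration only.
example_form = {"regression_type": "lasso", "data_type": "csv", "regression_index": "0"}
print(missing_fields(example_form))
print(missing_fields({"data_type": "csv"}))
```

Returning the list of missing names, rather than a bare boolean, lets a caller report exactly which fields the client omitted.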

Returns:

A JSON object containing the result of the regression, including the intercept term.

This object will be an array of two-element dictionaries. The dictionaries define the individual regression coefficients and are formatted as follows: {Name: "coefficient_name", Val: coefficient_val}.

An example response showing two coefficients and an intercept term is shown below.

[{Name: "rm", Val: 4.2}, {Name: "Intercept", Val: 20.8}, {Name: "lstat", Val: -0.5}]
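
A client could turn a response of this shape into a name-to-value lookup, as in the following sketch. The helper name coefficients_by_name is hypothetical, and the sketch assumes the response body is strictly valid JSON with quoted keys, which the abbreviated notation above does not show.

```python
import json

def coefficients_by_name(response_text):
    """Map each coefficient name to its value; illustrative helper only."""
    entries = json.loads(response_text)
    return {entry["Name"]: entry["Val"] for entry in entries}

# Sample body mirroring the example response, written as strict JSON.
sample = ('[{"Name": "rm", "Val": 4.2}, {"Name": "Intercept", "Val": 20.8}, '
          '{"Name": "lstat", "Val": -0.5}]')
lookup = coefficients_by_name(sample)
print(lookup["Intercept"])
```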