Lathe.preprocess - Module

|====== Lathe.preprocess =====

|____________/ Generalized Processing ___________

|_____preprocess.TrainTestSplit

|_____preprocess.SortSplit

|_____preprocess.UniformSplit

|____________/ Feature Scaling ___________

|_____preprocess.Rescaler

|_____preprocess.ArbitraryRescaler

|_____preprocess.MeanScaler

|_____preprocess.StandardScaler

|____________/ Categorical Encoding ___________

|_____preprocess.OneHotEncoder

|_____preprocess.OrdinalEncoder

|_____preprocess.FloatEncoder

Lathe.preprocess.ArbitraryRescaler - Type

Arbitrary Rescaler

Description

Arbitrarily rescales an array.


Input

ArbitraryRescaler(x)


Positional Arguments

Array{Any} - x:: The array on which the scaler is based.


Output

scaler :: A Lathe Preprocesser object.

Functions

Preprocesser.predict(xt) :: Applies the scaler to xt.


Data

a :: The minimum value in the array.

b :: The maximum value in the array.
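The source does not state the formula ArbitraryRescaler applies. A minimal sketch, assuming the common reading of arbitrary rescaling as mapping values onto a chosen interval using the stored minimum a and maximum b; the helper `arbitrary_rescale_sketch` and its `lo` and `hi` parameters are hypothetical and not part of Lathe:

```julia
# Hypothetical helper, not Lathe's implementation.
# Fits on x (storing its minimum a and maximum b) and returns a function
# that maps new values onto the interval [lo, hi].
function arbitrary_rescale_sketch(x, lo, hi)
    a, b = extrema(x)  # minimum and maximum of the fitting array
    return xt -> [lo + (v - a) * (hi - lo) / (b - a) for v in xt]
end

to_range = arbitrary_rescale_sketch([0.0, 5.0, 10.0], -1.0, 1.0)
rescaled = to_range([0.0, 5.0, 10.0])  # [-1.0, 0.0, 1.0]
```

The sketch divides by b - a, so a constant fitting array would need separate handling.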

Lathe.preprocess.FloatEncoder - Type

Float Encoder

Description

Float (label) encodes an array.


Input

FloatEncoder()


Output

encoder :: A Lathe Preprocesser object.

Functions

Preprocesser.predict(xt) :: Returns a float-encoded xt.

Lathe.preprocess.MeanScaler - Type

Mean Normalizer

Description

Normalizes an array using the mean of the data.


Input

MeanScaler(x)


Positional Arguments

Array{Any} - x:: The array on which the scaler is based.


Output

scaler :: A Lathe Preprocesser object.

Functions

Preprocesser.predict(xt) :: Applies the scaler to xt.


Data

a :: The minimum value in the array.

b :: The maximum value in the array.

avg :: The mean of the array.
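The stored fields a, b, and avg match the usual mean-normalization formula, (x - avg) / (b - a). A minimal sketch under that assumption; `mean_normalize_sketch` is a hypothetical helper, not part of Lathe:

```julia
using Statistics

# Hypothetical helper, not Lathe's implementation.
# Fits on x, then centres new values on the mean and divides by the range.
function mean_normalize_sketch(x)
    a, b = extrema(x)  # minimum and maximum of the fitting array
    avg = mean(x)      # mean of the fitting array
    return xt -> [(v - avg) / (b - a) for v in xt]
end

normalize = mean_normalize_sketch([1.0, 2.0, 3.0])
normalized = normalize([1.0, 2.0, 3.0])  # [-0.5, 0.0, 0.5]
```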

Lathe.preprocess.OneHotEncoder - Type

OneHotEncoder

Description

One-hot encodes a DataFrame column and returns the result as a DataFrame.


Input

OneHotEncoder()


Output

encoder :: A Lathe Preprocesser object.

Functions

Preprocesser.predict(df, symb) :: Applies the encoder to the column of df named by symb, then returns a DataFrame with the encoded results.
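To illustrate what one-hot encoding produces, here is a minimal sketch that works on a plain vector and returns a Dict of Boolean columns instead of a DataFrame, so it needs no packages. `onehot_sketch` is a hypothetical helper, and the output layout is an assumption rather than Lathe's actual return format:

```julia
# Hypothetical helper, not Lathe's implementation.
# Builds one Boolean indicator column per distinct label in the input.
function onehot_sketch(column)
    return Dict(label => [v == label for v in column] for label in unique(column))
end

encoded = onehot_sketch(["hello", "world", "hello"])
# encoded["hello"] is [true, false, true]
# encoded["world"] is [false, true, false]
```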

Lathe.preprocess.OrdinalEncoder - Type

Ordinal Encoder

Description

Ordinally encodes an array.


Input

OrdinalEncoder(x)


Positional Arguments

Array{Any} - x:: The array on which the encoder is based.


Output

encoder :: A Lathe Preprocesser object.

Functions

Preprocesser.predict(xt) :: Returns an ordinally encoded xt.
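A minimal sketch of ordinal encoding, assuming each distinct label is numbered in order of first appearance in the fitting array; the numbering rule and the helper name `ordinal_sketch` are assumptions, not taken from Lathe:

```julia
# Hypothetical helper, not Lathe's implementation.
# Fits a label => integer lookup on x, then encodes new data with it.
function ordinal_sketch(x)
    lookup = Dict(label => i for (i, label) in enumerate(unique(x)))
    return xt -> [lookup[v] for v in xt]
end

encode = ordinal_sketch(["low", "mid", "high", "mid"])
codes = encode(["high", "low", "mid"])  # [3, 1, 2]
```

In this sketch a label that was absent from the fitting array raises a KeyError.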

Lathe.preprocess.Rescaler - Type

Rescaler

Description

Rescales an array.


Input

Rescaler(x)


Positional Arguments

Array{Any} - x:: The array on which the scaler is based.


Output

scaler :: A Lathe Preprocesser object.

Functions

Preprocesser.predict(xt) :: Applies the scaler to xt.


Data

min :: The minimum value in the array.

max :: The maximum value in the array.
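The source does not spell out the formula behind Rescaler. A minimal sketch, assuming standard min-max rescaling to [0, 1] with the stored min and max; `minmax_sketch` is a hypothetical helper, not part of Lathe:

```julia
# Hypothetical helper, not Lathe's implementation.
# Fits on x, then maps new values so that min(x) -> 0 and max(x) -> 1.
function minmax_sketch(x)
    lo, hi = extrema(x)  # minimum and maximum of the fitting array
    return xt -> [(v - lo) / (hi - lo) for v in xt]
end

scale = minmax_sketch([2.0, 4.0, 6.0])
scaled = scale([2.0, 3.0, 6.0])  # [0.0, 0.25, 1.0]
```

Values in xt that fall outside the fitted range land outside [0, 1].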

Lathe.preprocess.StandardScaler - Type

Standard Scaler

Description

Standardizes an array using z-scores from a normal distribution fitted to the data.


Input

StandardScaler(x)


Positional Arguments

Array{Any} - x:: The array on which the scaler is based.


Output

scaler :: A Lathe Preprocesser object.

Functions

Preprocesser.predict(xt) :: Applies the scaler to xt.


Data

dist :: The normal distribution object that this scaler uses.
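A minimal sketch of z-score standardization, (x - mean) / std, using only the Statistics standard library rather than a distribution object. `standardize_sketch` is a hypothetical helper, and the use of the sample standard deviation (Julia's default for `std`) is an assumption:

```julia
using Statistics

# Hypothetical helper, not Lathe's implementation.
# Fits mean and standard deviation on x, then converts new values to z-scores.
function standardize_sketch(x)
    mu = mean(x)
    sigma = std(x)  # sample standard deviation (divides by n - 1)
    return xt -> [(v - mu) / sigma for v in xt]
end

standardize = standardize_sketch([1.0, 2.0, 3.0])
z = standardize([1.0, 2.0, 3.0])  # approximately [-1.0, 0.0, 1.0]
```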

Lathe.preprocess.SortSplit - Function

Sort Split

Description

Sorts an array, then splits it.


Input

SortSplit(x, .75, false)


Positional Arguments

Array{Any} - data:: The data to split.

Float64 - at:: The fraction of the data (for example .75) at which the split is made.

Bool - rev:: Determines whether the order of the sort should be reversed.


Output

train:: The larger portion of the split set.

test:: The smaller portion of the split set.
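A minimal sketch of a sort-then-split, assuming the split index is at times the length, rounded to the nearest integer; the rounding rule and the helper name `sort_split_sketch` are assumptions, not taken from Lathe:

```julia
# Hypothetical helper, not Lathe's implementation.
# Sorts the data (optionally in reverse), then cuts it at the given fraction.
function sort_split_sketch(data, at, rev)
    ordered = sort(data; rev = rev)
    cut = round(Int, at * length(ordered))  # index of the last training element
    return ordered[1:cut], ordered[cut+1:end]
end

train, test = sort_split_sketch([3, 1, 4, 2], 0.75, false)
# train is [1, 2, 3], test is [4]
```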

Lathe.preprocess.TrainTestSplit - Function

TrainTestSplit

Description

Splits an array or dataframe into two smaller groups based on the percentage provided in the at parameter.


Input

TrainTestSplit(x, .75)


Positional Arguments

Array{Any}, DataFrame - data:: The data to split.

Float64 - at:: The fraction of the data (for example .75) at which the split is made.


Output

train:: The larger portion of the split set.

test:: The smaller portion of the split set.
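The source does not say how TrainTestSplit selects rows. A minimal sketch for vectors only, assuming the indices are shuffled before the cut (in contrast to a split that keeps the original order); `shuffle_split_sketch` and its `rng` keyword are hypothetical, not part of Lathe:

```julia
using Random

# Hypothetical helper, not Lathe's implementation.
# Shuffles the indices, then assigns the first `at` fraction to train.
function shuffle_split_sketch(data, at; rng = Random.default_rng())
    idx = shuffle(rng, 1:length(data))   # random permutation of the indices
    cut = round(Int, at * length(data))  # number of training elements
    return data[idx[1:cut]], data[idx[cut+1:end]]
end

train, test = shuffle_split_sketch(collect(1:8), 0.75; rng = MersenneTwister(1))
# train has 6 elements and test has 2; together they contain every input once
```

Passing a seeded generator makes the split reproducible between runs.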

Lathe.preprocess.UniformSplit - Function

Uniform Split

Description

Uniform Split will split an array without shuffling the data first.


Input

UniformSplit(x, .75)


Positional Arguments

Array{Any} - data:: The data to split.

Float64 - at:: The fraction of the data (for example .75) at which the split is made.


Output

train:: The larger portion of the split set.

test:: The smaller portion of the split set.
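A minimal sketch of an order-preserving split, assuming the first at fraction of the elements becomes train and the remainder becomes test; the rounding rule and the helper name `uniform_split_sketch` are assumptions, not taken from Lathe:

```julia
# Hypothetical helper, not Lathe's implementation.
# Cuts the data at the given fraction without reordering it.
function uniform_split_sketch(data, at)
    cut = round(Int, at * length(data))  # index of the last training element
    return data[1:cut], data[cut+1:end]
end

train, test = uniform_split_sketch([10, 20, 30, 40], 0.75)
# train is [10, 20, 30], test is [40]
```

Because the order is kept, data that is already sorted or grouped gives train and test sets drawn from different parts of the range.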

Lathe.preprocess.@onehot - Macro

One-hot encodes a DataFrame column.

Takes a DataFrame and a symbol naming the column to one-hot encode.


using DataFrames

df = DataFrame(:A => ["hello","world"], :B => ["Foo", "Bar"])

encoded = @onehot df, :A