Transforming Data – Learn It 2

  • Decide which transformation to use for different types of data sets and analyze the results

Transformation of Data

A well-chosen transformation can make the data less skewed and make the variation more uniform.

data transformation

[1]In statistics, data transformation is the application of a deterministic mathematical function to each point in a data set. Transformations are usually applied so that the data appear to more closely meet the assumptions of a statistical inference procedure that is to be applied, or to improve the interpretability or appearance of graphs.

[2]Nearly always, the function that is used to transform the data is invertible, and generally is continuous. The transformation is usually applied to a collection of comparable measurements. For example, if we are working with data on people’s incomes in some currency unit, it would be common to transform each person’s income value by the logarithm function.
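The income example can be sketched in a few lines of Python. This is only an illustration: the income values are made up, the choice of base 10 is arbitrary, and the `skewness` helper is a hypothetical function written here (a simple population skewness, the mean of the cubed standardized values), not something the text above prescribes.

```python
import math

# Hypothetical incomes in some currency unit (made-up values for illustration).
incomes = [18_000, 22_000, 25_000, 31_000, 40_000, 55_000, 90_000, 250_000]


def skewness(values):
    """Population skewness: the mean of the cubed standardized values.

    Hypothetical helper for this sketch; positive results indicate a
    longer right tail.
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return sum(((v - mean) / sd) ** 3 for v in values) / n


# Apply the same deterministic function (log base 10) to every data point.
log_incomes = [math.log10(v) for v in incomes]

# The transformation is invertible: raising 10 to each transformed value
# recovers the original income (up to floating-point rounding).
recovered = [10 ** y for y in log_incomes]

print("skewness before:", round(skewness(incomes), 2))
print("skewness after: ", round(skewness(log_incomes), 2))
```

For this made-up data set, the transformed values are less right-skewed than the raw incomes, which is the effect a logarithm is usually chosen for.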

Let’s recall two of the most common data transformation functions: square root and logarithm.

The square root of a number is a value that, when multiplied by itself, gives the number.[3] For example: [latex]\sqrt{9}=3[/latex]
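As a small sketch of the square root used as a transformation, the snippet below applies it to every value in a data set and then checks the definition by multiplying each root by itself. The `counts` list is made up for illustration and uses perfect squares so the arithmetic is exact.

```python
import math

# Hypothetical data set (made-up perfect squares so the results are exact).
counts = [1, 4, 9, 16, 100]

# Apply the square root to each data point.
roots = [math.sqrt(c) for c in counts]

# By definition, each root multiplied by itself gives back the original number.
squared_back = [r * r for r in roots]

print(roots)         # [1.0, 2.0, 3.0, 4.0, 10.0]
print(squared_back)  # [1.0, 4.0, 9.0, 16.0, 100.0]
```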

A logarithm answers the question, “To what power must we raise one number to get another number?”

 

For example, consider the question: “To what power must we raise 2 to get 8?” We see that [latex]2 \cdot 2 \cdot 2 = 2^3 = 8[/latex]. The way we write this logarithm is [latex]\text{log}_2(8)=3[/latex].

 

In general, the statements

[latex]b^x = a[/latex] and [latex]\text{log}_b(a)=x[/latex]

contain the same information. In both the exponential form and the logarithmic form, the quantity [latex]b[/latex] is called the base.
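The equivalence of the two forms can be checked numerically. The sketch below uses Python's `math.log(a, b)`, which returns the logarithm of `a` in base `b`; the helper name `log_base` is hypothetical, and results are compared with a tolerance because floating-point logarithms are not always exact.

```python
import math


def log_base(b, a):
    """Return x such that b**x equals a (hypothetical helper for this sketch)."""
    return math.log(a, b)


# "To what power must we raise 2 to get 8?"
x = log_base(2, 8)

# Logarithmic form and exponential form carry the same information:
# log_2(8) = x  and  2**x = 8.
print("log_2(8) is approximately", x)
print("2**x is approximately", 2 ** x)
```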

  • A base that is often used in logarithms is [latex]10[/latex]; instead of writing [latex]\text{log}_{10}(x)[/latex], we often just write [latex]\text{log}(x)[/latex].

[latex]\text{log}_{10}(x) = \text{log}(x)[/latex]

  • Another common base that you may encounter is the irrational number [latex]e[/latex], which is approximately equal to [latex]2.718[/latex]; instead of writing [latex]\text{log}_e(x)[/latex], we often just write [latex]\text{ln}(x)[/latex] and call this the “natural logarithm of [latex]x[/latex].”

[latex]\text{log}_e(x) = \text{ln}(x)[/latex]
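Both common bases are available in Python's standard `math` module, which makes for a quick sketch of the notation above: `math.log10(x)` corresponds to [latex]\text{log}(x)[/latex] and `math.log(x)` with one argument corresponds to [latex]\text{ln}(x)[/latex]. The value `x = 1000` is an arbitrary choice for illustration.

```python
import math

x = 1000  # arbitrary example value

common = math.log10(x)  # base-10 logarithm, written log(x) in the text
natural = math.log(x)   # base-e logarithm, written ln(x) in the text

# The base e itself, rounded to three decimal places.
e_rounded = round(math.e, 3)

# Each logarithm undoes the matching exponential.
back_from_common = 10 ** common
back_from_natural = math.e ** natural

print("log(1000) =", common)
print("ln(1000)  =", natural)
print("e is approximately", e_rounded)
```

Note that many programming languages, Python included, use `log` for the natural logarithm, whereas in this course [latex]\text{log}(x)[/latex] without a base means base [latex]10[/latex].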


  1. Data transformation (statistics). (n.d.). Wikipedia. https://en.wikipedia.org/wiki/Data_transformation_(statistics)
  2. Data transformation (statistics). (n.d.). Wikipedia. https://en.wikipedia.org/wiki/Data_transformation_(statistics)
  3. Definition of square root. (n.d.). Mathsisfun.com. https://www.mathsisfun.com/definitions/square-root.html