'''Correlate summation analysis''' is a [[data mining]] method. It is designed to find the [[Variable (mathematics)|variables]] that are most [[covariance|covariant]] with all of the other variables being studied, relative to [[Data clustering|clustering]]. Aggregate correlate summation is, for a given variable, the product of two quantities: the sum of the negative [[logarithm]]s of the [[p-value]]s of all [[correlation]]s to that variable, and the variable's normalized [[standard deviation]]-to-[[mean]] quotient. Discrete correlate summation is, for a given variable, the product of the summed absolute logarithms of the ratios of two groups' correlation p-values and the absolute logarithm of the ratio of the two group means.

==Correlate summation template==

This zipped Excel template performs a correlate summation analysis for up to 100 variables in 4 groups of 15 subjects:

[http://sites.google.com/site/correlatesummationtemplate/Home/correlate-summation-template/correlate.zip?attredirects=0]

The paper describing the method is embedded in the spreadsheet.<ref name=rat>{{cite journal|author=Westwood, B|coauthors=Chappell, M.|year=2006|title=Proceedings of the 1st international workshop on Text mining in bioinformatics - TMBIO '06|publisher=TMBIO '06 (ACM)|pages=21–26|doi=10.1145/1183535.1183542|chapter=Application of correlate summation to data clustering in the estrogen- and salt-sensitive female mRen2.Lewis rat|isbn=1-59593-526-6}}</ref>

==Discrete correlate summation==

Given two groups, a correlation [[matrix (mathematics)|matrix]] (''m'' by ''m'') is constructed for ''m'' variables for each group. Each column represents all of the correlations (''r'') between a given variable and each of the other variables. Because variables may differ in their number of data points (''n''), the ''n'' for each individual correlation is calculated by assigning each data point a value of one and summing the products over the pairs in that correlation, so that only pairs in which both values are present are counted.
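
The pair-counting step can be sketched in Python as follows. This is only an illustration, not the template's own implementation: the function name <code>pairwise_n</code>, the use of <code>None</code> for a missing data point, and the sample values are all assumptions made for the example.

```python
def pairwise_n(columns):
    """Return the m-by-m matrix of pair counts for m variables.

    `columns` is a list of m equal-length lists; None marks a missing value
    (an assumption of this sketch, not something the source specifies).
    """
    # Indicator: 1 where a data point is present, 0 where it is missing.
    present = [[0 if v is None else 1 for v in col] for col in columns]
    m = len(columns)
    # n for variables i and j is the sum of the products of their indicators.
    return [[sum(a * b for a, b in zip(present[i], present[j]))
             for j in range(m)] for i in range(m)]


# Hypothetical data: three variables measured on five subjects.
data = [
    [1.0, 2.0, None, 4.0, 5.0],
    [2.1, None, 3.3, 4.2, 5.0],
    [0.5, 0.7, 0.9, 1.1, 1.3],
]
n = pairwise_n(data)
```

Here the first two variables share only three complete pairs, so <code>n[0][1]</code> is 3 even though each variable has four data points of its own.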

The significance of each linear correlation is tested using [[Student's t-distribution]] to evaluate:

:<math>t=\frac{|r|}{\sqrt{\frac{1-r^2}{n-2}}}</math>

for (''n'' − 2) degrees of freedom, using a two-tailed test.<ref>Swinscow, T. (1997) ''[http://www.bmj.com/collections/statsbk/11.dtl Statistics at Square One]''. BMJ Publishing Group.</ref>
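
A minimal Python sketch of this test is given below, using only the standard library. The function names and the numerical approach are assumptions of the sketch: the two-tailed p-value is obtained by integrating the Student's t density with Simpson's rule after the substitution ''t'' = √df · tan θ, which is one of several possible ways to evaluate it and is not taken from the source. It requires ''n'' > 2 and |''r''| < 1.

```python
import math


def t_statistic(r, n):
    """t = |r| / sqrt((1 - r^2) / (n - 2)); needs n > 2 and |r| < 1."""
    return abs(r) / math.sqrt((1.0 - r * r) / (n - 2))


def two_tailed_p(r, n, steps=2000):
    """Two-tailed p-value of t_statistic(r, n) with n - 2 degrees of freedom.

    The tail area of the t density is mapped onto the finite interval
    [atan(t / sqrt(df)), pi/2], where the integrand becomes cos(theta)**(df - 1),
    and is integrated with Simpson's rule (`steps` must be even).
    """
    df = n - 2
    t = t_statistic(r, n)
    lo = math.atan(t / math.sqrt(df))
    hi = math.pi / 2
    h = (hi - lo) / steps

    def f(theta):
        return math.cos(theta) ** (df - 1)

    total = f(lo) + f(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    integral = total * h / 3
    # Normalizing constant Gamma((df + 1) / 2) / (sqrt(pi) * Gamma(df / 2)).
    norm = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    return 2 * norm * integral


# With n = 4 (two degrees of freedom) the two-tailed p-value equals 1 - |r|.
p_example = two_tailed_p(0.9, 4)
```

The case ''n'' = 4 is a convenient check because the t-distribution with two degrees of freedom has a closed-form tail area, which gives a p-value of 1 − |''r''|.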

The correlation matrices are thus transformed into linear probability matrices. For the two groups, the absolute value of the logarithm of the ratio of each comparison's p-values gives a log correlation ratio (log<sub>cr</sub>) that grows as the ratio approaches zero or infinity. Each column is totaled to form the discrete correlate summation array. In the same way, the log mean ratio (log<sub>mr</sub>) is obtained for each variable from the two groups' means. The correlate summation is then multiplied by the log mean ratio to yield the discrete mean-correlate summation (DCΣ<sub>x</sub>).<ref name=rat/>
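
The calculation described above can be sketched as follows. Several details are assumptions of this sketch rather than statements from the source: the logarithm is taken to base 10, the diagonal (each variable's correlation with itself) is left out of the column totals, all means are positive, and the function name and p-values are invented for illustration.

```python
import math


def discrete_mean_correlate_summation(p_a, p_b, means_a, means_b):
    """Return the DCΣx array for two groups A and B.

    p_a, p_b: m-by-m p-value matrices (diagonal entries are ignored).
    means_a, means_b: per-variable group means (assumed positive).
    """
    m = len(p_a)
    result = []
    for j in range(m):
        # Column total of the log correlation ratios |log10(p_A / p_B)|.
        log_cr_total = sum(abs(math.log10(p_a[i][j] / p_b[i][j]))
                           for i in range(m) if i != j)
        # Log mean ratio |log10(mean_A / mean_B)| for the same variable.
        log_mr = abs(math.log10(means_a[j] / means_b[j]))
        result.append(log_cr_total * log_mr)
    return result


# Hypothetical p-values and means for three variables.
p_a = [[None, 0.01, 0.5], [0.01, None, 0.1], [0.5, 0.1, None]]
p_b = [[None, 0.1, 0.5], [0.1, None, 0.001], [0.5, 0.001, None]]
dcs_x = discrete_mean_correlate_summation(p_a, p_b,
                                          [10.0, 2.0, 100.0],
                                          [1.0, 2.0, 10.0])
```

In this made-up example the second variable has the largest column total but identical group means, so its log mean ratio is zero and its DCΣ<sub>x</sub> vanishes; a variable scores highly only when both its correlations and its mean differ between the groups.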

==Aggregate correlate summation==

As in the discrete correlate summation, a linear probability matrix is calculated, here for all of the data without grouping. The negative logarithm of each p-value is taken, and the columns are totaled to give the aggregate correlate summation (ACΣ) array. The standard deviation of each variable is divided by its mean to normalize the variances between variables. Data with a [[bimodal distribution]] will have a larger normalized standard deviation (nSD) than will data with a [[normal distribution]]. The nSD array multiplied by the ACΣ array yields the aggregate mean-correlate summation (ACΣ<sub>x</sub>).<ref name=rat/>
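
A short Python sketch of the aggregate calculation follows. As before, it rests on assumptions that the source does not settle: a base-10 logarithm, exclusion of the diagonal from the column totals, the sample (''n'' − 1) standard deviation, and illustrative p-values that are not derived from the sample data shown. The function name is hypothetical.

```python
import math
import statistics


def aggregate_mean_correlate_summation(p, columns):
    """Return the ACΣx array.

    p: m-by-m p-value matrix for the ungrouped data (diagonal ignored).
    columns: the m variables as lists of observations with non-zero means.
    """
    m = len(p)
    result = []
    for j in range(m):
        # ACΣ: column total of the negative log10 p-values.
        ac_sum = sum(-math.log10(p[i][j]) for i in range(m) if i != j)
        # nSD: sample standard deviation divided by the mean.
        nsd = statistics.stdev(columns[j]) / statistics.mean(columns[j])
        result.append(ac_sum * nsd)
    return result


# Hypothetical p-values and observations for three variables.
p = [[None, 0.01, 0.1], [0.01, None, 0.001], [0.1, 0.001, None]]
columns = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [10.0, 10.0, 10.0]]
acs_x = aggregate_mean_correlate_summation(p, columns)
```

The third variable in this example is constant, so its nSD is zero and it contributes nothing regardless of its p-values, which illustrates how the nSD factor weights the summation toward variables that actually vary.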

==Non-linear modeling==

A linear correlation between variables for a given sample set is typically the initial step in the investigation of relationships, which may lead to an underlying mechanism. The variation (either inherent or in response to a challenge) in a given population gives rise to correlations between variables in which only a portion of the [[sigmoid function|sigmoidal]] ([[Control system|control]]) relationship may be evident. Generally, when data defy [[linear regression]], the data patterns indicate a [[power (mathematics)|power]] relationship of the general type:

:<math>y=mx^a</math>

Type 1: ''a'' < 0 is a [[hyperbola|hyperbolic]] function

Type 2: ''a'' = 0 is a horizontal line

Type 3: 0 < ''a'' < 1 is a [[Nth root|root]] function

Type 4: ''a'' = 1 is a linear function

Type 5: ''a'' > 1 is a power function

(In all five cases a [[log-log plot]] yields a straight line.)<ref>Mandel, J. (1984) ''The Statistical Analysis of Experimental Data''. Dover Publications, Mineola, NY.</ref>
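
Because taking logarithms of both sides gives log ''y'' = log ''m'' + ''a'' log ''x'', the exponent can be estimated by an ordinary least-squares line through the log-transformed data. The Python sketch below illustrates this; the function names, the base-10 logarithm and the tolerance used to decide whether ''a'' equals 0 or 1 are assumptions of the example, and the fit requires strictly positive ''x'' and ''y'' values.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = coef * x**a by least squares on log10(x), log10(y).

    Returns (coef, a), where coef plays the role of m in y = m x^a.
    """
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    a = sxy / sxx                            # slope of the log-log line
    coef = 10 ** (mean_y - a * mean_x)       # intercept, transformed back
    return coef, a


def classify_exponent(a, tol=1e-6):
    """Map the exponent a to the Type 1 to Type 5 cases listed above."""
    if a < -tol:
        return 1          # hyperbolic
    if abs(a) <= tol:
        return 2          # horizontal line
    if a < 1 - tol:
        return 3          # root function
    if abs(a - 1) <= tol:
        return 4          # linear function
    return 5              # power function


# Hypothetical noise-free data generated from y = 3 * sqrt(x).
xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0 * x ** 0.5 for x in xs]
coef, a = fit_power_law(xs, ys)
kind = classify_exponent(a)
```

With real data the estimated exponent will rarely be exactly 0 or 1, so a fixed tolerance is a crude criterion; a confidence interval for the slope would be a more defensible way to choose between the types.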

On a positive sigmoidal/[[logistic curve]], the initial, intermediate and late portions resemble power, linear and root functions, respectively<!-- (Figure 1) -->. Also, the late portion of a negative control function is reminiscent of a hyperbolic curve.

In an analysis of variable correlation, the sigmoidal relationship over the entire data range (parts of which may be unsampled) should be considered. This type of analysis is accomplished by regression with a logistic curve, or by [[simple linear regression]] followed by further investigation of the Type 1, 3 and 5 power relationships.<ref name=rat/>

==References==

{{Reflist}}

[[Category:Covariance and correlation]]