# Linear regression

In advance I admit my statistical amateur status, but I still have numbers to crunch and a job to do so…

I have considered the sample model that uses linear regression on stock prices, etc. That model uses dates as the independent variable, x.

I am trying to do linear regression where x is not based on time but on throughput, such as hits on a web server. The dependent variable would be, for instance, a measurement of CPU.
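For readers who want to see the arithmetic outside of Quantrix, here is a minimal sketch of an ordinary least-squares fit with throughput as x and %CPU as y. The function name `fit_line` and the sample numbers are made up for illustration; they are not taken from the model discussed in this thread.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical samples: web server hits per interval and measured %CPU
hits = [100, 200, 300, 400, 500]
cpu = [12.0, 21.0, 33.0, 41.0, 52.0]

slope, intercept = fit_line(hits, cpu)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

Nothing in the fit depends on x being a date; any numeric independent variable works the same way.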

The other complication is that the contents of cells are dynamic, being fed by linked matrices, which are in turn fed by a datalink. I know that I am probably over-complicating the linkages, but I am just trying to tackle one obstacle at a time. Bottom line is: no cell value can be static.

So, taking one issue at a time: I have calculated the linear regression, but the independent value (x) is an item alongside the dependent value (y). x needs to be on the chart as the x axis, and y on the y axis. But with both of them in the same item group(?), I cannot chart them like that.

How do I rectify that?

Vince

Though I didn’t express it properly, I was trying to solve for x given y and the intercept and slope.
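Solving for x just means inverting the regression line, x = (y - intercept) / slope. A minimal sketch, assuming a slope and intercept are already known; the values and the name `x_for_y` are hypothetical:

```python
def x_for_y(y, slope, intercept):
    """Invert y = intercept + slope * x to find the x that gives a target y."""
    if slope == 0:
        raise ValueError("slope is zero, so y does not depend on x")
    return (y - intercept) / slope


# Hypothetical regression coefficients (%CPU per hit, and baseline %CPU)
slope, intercept = 0.1, 1.8

# How many hits would the line associate with 80 %CPU?
hits_at_80 = x_for_y(80.0, slope, intercept)
print(round(hits_at_80, 1))
```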

Based on all the tips I was given in the forums I did manage to work that out. I’m happy to say that I was able to achieve all my objectives with the model I was working on based on the advice I received in the forums.

Thanks again.

Vince, I would think you would be able to calculate a y for any x since you have the intercept and slope. So it would be do-able.

Steve

After reviewing the attached model I have to say LOL. I am so relieved.

The x,y plot is exactly what I need. And again, it was something simple that I overlooked. There is a lot to absorb in this product. But, since I have your attention…

What if I now wanted to project the y values based on the linear regression slope and a succession of higher x-values? The idea is to use the regression line and x (Hits for instance) to predict y (%CPU for instance). Do-able?
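As a rough illustration of that projection (not how Quantrix does it), each projected y is simply intercept + slope * x for each new x. The coefficients and the range of hit counts below are hypothetical:

```python
def project(xs, slope, intercept):
    """Evaluate the regression line y = intercept + slope * x at each x."""
    return [intercept + slope * x for x in xs]


# Hypothetical coefficients from an earlier fit of %CPU against hits
slope, intercept = 0.1, 1.8

# A succession of higher x-values than were measured
future_hits = list(range(600, 1001, 100))
projected_cpu = project(future_hits, slope, intercept)

for h, c in zip(future_hits, projected_cpu):
    print(h, round(c, 1))
```

Keep in mind this is extrapolation: the line will happily predict more than 100 %CPU, so projections far outside the measured range should be read with caution.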

Vince

Pete and Steve,

Thanks for your help. The resolutions I am getting to my questions on the forums are leading up to some impressive results here. When I have the pieces together I’ll share what we’re doing with you.

I am absorbing both suggestions now and will reply with my results shortly.

Vince

Vince,

I discussed your question with Steve today and think we have another approach for you. It seems like you might be looking to graphically present the relationship between your measured data and the linear regression. Quantrix has a couple of chart types that might help you with this: Plot and Scatter Plot. In both cases, these charts plot x vs. y, where x and y are arranged in columns in the matrix.

In your model, I inserted another column for Load and moved the existing Load to the left of the Linear Regression item. This allowed for two pairs of xs and ys. Then I sorted by Load to get the datapoints in an order that would plot reasonably, rather than ordered by sampling time. This then led to a reasonable chart which shows the linear regression vs. the measured data on one plot. The file is currently configured with a plot (where the points are connected by lines), but you can change to a scatter and see what it shows. It’s also fun to clear the sort on the first column in a plot and see the points connected in order of time rather than load.
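The reason for sorting can be sketched outside Quantrix: a line plot connects points in row order, so rows sorted by load trace left to right, while rows in sampling order zigzag. The sample data and coefficients below are hypothetical, and the layout only loosely mirrors the two x/y pairs described above:

```python
# Hypothetical (load, measured %CPU) samples in sampling-time order
samples = [(300, 33.0), (100, 12.0), (500, 52.0), (200, 21.0), (400, 41.0)]

# Hypothetical regression coefficients
slope, intercept = 0.1, 1.8

# Sort by load so a connected line plot runs left to right
by_load = sorted(samples, key=lambda s: s[0])

# Each row carries load, the measured value, and the regression value at that load
rows = [(load, cpu, intercept + slope * load) for load, cpu in by_load]

for load, measured, fitted in rows:
    print(load, measured, round(fitted, 1))
```

A scatter chart does not connect the points, so it looks the same whether or not the rows are sorted.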

Here’s a link to the model as our forum software apparently can’t handle the increased file size.

https://www.quantrix.com/s/forum-attachments/TVDW_WMTPUI3_Profile_v2_export-WITHCHART.model

Does this get at what you are trying to do?

Cheers,

pete
peter m. murray
pete@quantrix.com

Vince, I looked over your model and noted that there were 21 million cells, which is impressive; however, it was too impressive for my laptop. I tried the solution I describe below, but I cannot verify for you that it worked.

What you might consider is creating a new matrix via a datalink that will take the x and y values and datalink them as categories, plus what you want to plot as an item. If you set the selection in the datalink wizard correctly, this will be a dynamic datalink, so as the values in the source matrix change the “charting” matrix will change as well. I did something similar at:

I hope this works, and please let me know your results. If not, maybe we can trim the model down data-wise so that I can work out a tested solution.

Steve

Here are the steps I took with the datalink wizard:
Select data source as two dimensional matrix data
Select the category that contains your x, y and plotting data (“Item” in your case?)
Select data destination as Multi-dimensional OLAP Analysis
Create new matrix
Select your columns to include (x, y and plotting data)
Set your x and y to category and plotting data to item (sort if necessary)
Set any summary choices
Set “Remove previously imported category items and data”
Hit finish (this is where I needed more horsepower)

Steve,

Model is attached.

CFI is the matrix that is normally datalinked, but I removed it for this export.

Thanks for the help,

Vince

Thanks Steve. I’m cleaning up an excerpted model to attach. Should post in a minute or two.

Vince

Vince, I would like to help you out. Is there a model you can send me to look at?

Steve