Multiple linear regression for a dataset in R with ggplot2


I am testing a sentiment analysis on a dataset. Here I am trying to see whether there are any interesting relationships between message volume and buzz, message volume and scores, and so on.

This is what the dataset looks like:

    > str(data)
    'data.frame':   40 obs. of  11 variables:
     $ date time   : POSIXct, format: "2015-07-08 09:10:00" "2015-07-08 09:10:00" ...
     $ subject     : chr  "mmm" "ace" "aes" "afl" ...
     $ sscore      : chr  "-0.2280" "-0.4415" "1.9821" "-2.9335" ...
     $ smean       : chr  "0.2593" "0.3521" "0.0233" "0.0035" ...
     $ svscore     : chr  "-0.2795" "-0.0374" "1.1743" "-0.2975" ...
     $ sdispersion : chr  "0.375" "0.500" "1.000" "1.000" ...
     $ svolume     : num  8 4 1 1 5 3 2 1 1 2 ...
     $ sbuzz       : chr  "0.6026" "0.7200" "1.9445" "0.8321" ...
     $ last close  : chr  "155.430000000" "104.460000000" "13.200000000" "61.960000000" ...
     $ company name: chr  "3m company" "ace limited" "the aes corporation" "aflac inc." ...
     $ date        : Date, format: "2015-07-08" "2015-07-08" ...

I thought of a linear regression and wanted to use ggplot for it. I used the code below, but I think I got something wrong somewhere, because no regression lines appear. Is that because the regression is too weak? I based my code on topchef's code.

Mine is:

    library(ggplot2)
    library(reshape2)

    data.2 = melt(data[3:9], id.vars = 'svolume')

    ggplot(data.2) +
      geom_jitter(aes(value, svolume, colour = variable)) +
      geom_smooth(aes(value, svolume, colour = variable), method = lm, se = FALSE) +
      facet_wrap(~variable, scales = "free_x") +
      labs(x = "variables", y = "svolumes")

But I must have misunderstood something, because the result is not what I want. I am new to R.

My results:

I get this error:

    geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
    geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
    geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
    geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
    geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?
    geom_smooth: Only one unique x value each group.Maybe you want aes(group = 1)?

Finally, do you think it is possible to have different colours for the different subjects instead of one colour per variable? And can I add a regression line on every graph?

Thank you for your help.

Sample data:

                 date time subject  sscore  smean svscore sdispersion svolume  sbuzz    last close        company name       date
    1  2015-07-08 09:10:00     mmm -0.2280 0.2593 -0.2795       0.375       8 0.6026 155.430000000          3m company 2015-07-08
    2  2015-07-08 09:10:00     ace -0.4415 0.3521 -0.0374       0.500       4 0.7200 104.460000000         ace limited 2015-07-08
    3  2015-07-07 09:10:00     aes  1.9821 0.0233  1.1743       1.000       1 1.9445  13.200000000 the aes corporation 2015-07-07
    4  2015-07-04 09:10:00     afl -2.9335 0.0035 -0.2975       1.000       1 0.8321  61.960000000          aflac inc. 2015-07-04
    5  2015-07-07 09:10:00     mmm  0.2977 0.2713 -0.7436       0.400       5 0.4895 155.080000000          3m company 2015-07-07
    6  2015-07-07 09:10:00     ace -0.2331 0.3519 -0.1118       1.000       3 0.7196 103.330000000         ace limited 2015-07-07
    7  2015-06-28 09:10:00     aes  1.8721 0.0609  1.9100       0.500       2 2.4319  13.460000000 the aes corporation 2015-06-28
    8  2015-07-03 09:10:00     afl  0.6024 0.0330 -0.2663       1.000       1 0.6822  61.960000000          aflac inc. 2015-07-03
    9  2015-07-06 09:10:00     mmm -1.0057 0.2579 -1.3796       1.000       1 0.4531 155.380000000          3m company 2015-07-06
    10 2015-07-06 09:10:00     ace -0.0263 0.3435 -0.1904       1.000       2 1.3536 103.740000000         ace limited 2015-07-06
    11 2015-06-19 09:10:00     aes -1.1981 0.1517  1.2063       1.000       2 1.9427  13.850000000 the aes corporation 2015-06-19
    12 2015-07-02 09:10:00     afl -0.8247 0.0269  1.8635       1.000       5 2.2454  62.430000000          aflac inc. 2015-07-02
    13 2015-07-05 09:10:00     mmm -0.4272 0.3107 -0.7970       0.167       6 0.6003 155.380000000          3m company 2015-07-05
    14 2015-07-04 09:10:00     ace  0.0642 0.3274 -0.0975       0.667       3 1.2932 103.740000000         ace limited 2015-07-04
    15 2015-06-17 09:10:00     aes  0.1627 0.1839  1.3141       0.500       2 1.9578  13.580000000 the aes corporation 2015-06-17
    16 2015-07-01 09:10:00     afl -0.7419 0.0316  1.5699       0.250       4 2.0988  62.200000000          aflac inc. 2015-07-01
    17 2015-07-04 09:10:00     mmm -0.5962 0.3484 -1.2481       0.667       3 0.4496 155.380000000          3m company 2015-07-04
    18 2015-07-03 09:10:00     ace  0.8527 0.3085  0.1944       0.833       6 1.3656 103.740000000         ace limited 2015-07-03
    19 2015-06-15 09:10:00     aes  0.8145 0.1725  0.2939       1.000       1 1.6121  13.350000000 the aes corporation 2015-06-15
    20 2015-06-30 09:10:00     afl  0.3076 0.0538 -0.0938       1.000       1 0.7071  61.440000000          aflac inc. 2015-06-30

dput

    data <- structure(list(
      `date time` = structure(c(1436361000, 1436361000, 1436274600, 1436015400,
        1436274600, 1436274600, 1435497000, 1435929000, 1436188200, 1436188200,
        1434719400, 1435842600, 1436101800, 1436015400, 1434546600, 1435756200,
        1436015400, 1435929000, 1434373800, 1435669800),
        class = c("POSIXct", "POSIXt"), tzone = ""),
      subject = c("mmm", "ace", "aes", "afl", "mmm", "ace", "aes", "afl",
        "mmm", "ace", "aes", "afl", "mmm", "ace", "aes", "afl",
        "mmm", "ace", "aes", "afl"),
      sscore = c(-0.228, -0.4415, 1.9821, -2.9335, 0.2977, -0.2331, 1.8721,
        0.6024, -1.0057, -0.0263, -1.1981, -0.8247, -0.4272, 0.0642, 0.1627,
        -0.7419, -0.5962, 0.8527, 0.8145, 0.3076),
      smean = c(0.2593, 0.3521, 0.0233, 0.0035, 0.2713, 0.3519, 0.0609, 0.033,
        0.2579, 0.3435, 0.1517, 0.0269, 0.3107, 0.3274, 0.1839, 0.0316,
        0.3484, 0.3085, 0.1725, 0.0538),
      svscore = c(-0.2795, -0.0374, 1.1743, -0.2975, -0.7436, -0.1118, 1.91,
        -0.2663, -1.3796, -0.1904, 1.2063, 1.8635, -0.797, -0.0975, 1.3141,
        1.5699, -1.2481, 0.1944, 0.2939, -0.0938),
      sdispersion = c(0.375, 0.5, 1, 1, 0.4, 1, 0.5, 1, 1, 1, 1, 1, 0.167,
        0.667, 0.5, 0.25, 0.667, 0.833, 1, 1),
      svolume = c(8L, 4L, 1L, 1L, 5L, 3L, 2L, 1L, 1L, 2L, 2L, 5L, 6L, 3L, 2L,
        4L, 3L, 6L, 1L, 1L),
      sbuzz = c(0.6026, 0.72, 1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822,
        0.4531, 1.3536, 1.9427, 2.2454, 0.6003, 1.2932, 1.9578, 2.0988,
        0.4496, 1.3656, 1.6121, 0.7071),
      `last close` = c(155.43, 104.46, 13.2, 61.96, 155.08, 103.33, 13.46,
        61.96, 155.38, 103.74, 13.85, 62.43, 155.38, 103.74, 13.58, 62.2,
        155.38, 103.74, 13.35, 61.44),
      `company name` = c("3m company", "ace limited", "the aes corporation",
        "aflac inc.", "3m company", "ace limited", "the aes corporation",
        "aflac inc.", "3m company", "ace limited", "the aes corporation",
        "aflac inc.", "3m company", "ace limited", "the aes corporation",
        "aflac inc.", "3m company", "ace limited", "the aes corporation",
        "aflac inc."),
      date = structure(c(16624, 16624, 16623, 16620, 16623, 16623, 16614,
        16619, 16622, 16622, 16605, 16618, 16621, 16620, 16603, 16617,
        16620, 16619, 16601, 16616), class = "Date")),
      .names = c("date time", "subject", "sscore", "smean", "svscore",
        "sdispersion", "svolume", "sbuzz", "last close", "company name", "date"),
      row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11",
        "12", "13", "14", "15", "16", "17", "18", "19", "20"),
      class = "data.frame")

Note the warning: "Maybe you want aes(group = 1)?". All I've done is add group = 1 to the aes of geom_smooth.

    ggplot(data.2) +
      geom_jitter(aes(value, svolume, colour = variable)) +
      geom_smooth(aes(value, svolume, colour = variable, group = 1), method = lm, se = FALSE) +
      facet_wrap(~variable, scales = "free_x") +
      labs(x = "variables", y = "svolumes")
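One more thing that may be worth checking: the str() output in the question lists sscore, smean, svscore, sdispersion, sbuzz and last close as chr. If the real data frame still stores them as character, the melted value column is discrete, which could be why each group ends up with a single x value. Below is a minimal base-R sketch of converting such columns to numeric before melting. The small data frame and the measure_cols vector are stand-ins made up for illustration, not the asker's actual data.

```r
# Hypothetical stand-in: two measures stored as character, as in the str() output
data <- data.frame(
  subject = c("mmm", "ace", "aes"),
  sscore  = c("-0.2280", "-0.4415", "1.9821"),
  smean   = c("0.2593", "0.3521", "0.0233"),
  svolume = c(8, 4, 1),
  stringsAsFactors = FALSE
)

# Name the measure columns explicitly so text columns such as subject stay untouched
measure_cols <- c("sscore", "smean")
data[measure_cols] <- lapply(data[measure_cols], as.numeric)

str(data)
```

With numeric measures, melt() produces a numeric value column and lm fits against actual values.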

Some unsolicited advice

Here's how I would write the ggplot code:

    library(ggplot2)
    require(reshape2)

    data.2 = melt(data[3:9], id.vars = 'svolume')

    ggplot(data.2) +
      aes(x = value, y = svolume, colour = variable) +
      geom_jitter() +
      geom_smooth(method = lm, se = FALSE, aes(group = 1)) +
      facet_wrap(~variable, scales = "free_x") +
      labs(x = "variables", y = "svolumes")
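As for the follow-up about colouring by subject rather than by variable, one possible approach is to keep subject as a second id variable in melt() and map colour to it in geom_jitter only. This is a sketch rather than a tested solution for the full dataset: the data frame below is a reduced stand-in built from the first eight sample rows, and the black line colour is an arbitrary choice.

```r
library(ggplot2)
library(reshape2)

# Reduced stand-in for the question's data (first 8 sample rows, fewer columns)
data <- data.frame(
  subject = rep(c("mmm", "ace", "aes", "afl"), 2),
  sscore  = c(-0.2280, -0.4415, 1.9821, -2.9335, 0.2977, -0.2331, 1.8721, 0.6024),
  smean   = c(0.2593, 0.3521, 0.0233, 0.0035, 0.2713, 0.3519, 0.0609, 0.0330),
  svolume = c(8, 4, 1, 1, 5, 3, 2, 1),
  sbuzz   = c(0.6026, 0.7200, 1.9445, 0.8321, 0.4895, 0.7196, 2.4319, 0.6822)
)

# Keep subject as an id variable so it survives the melt
data.2 <- melt(data, id.vars = c("subject", "svolume"))

p <- ggplot(data.2) +
  aes(x = value, y = svolume) +
  geom_jitter(aes(colour = subject)) +            # points coloured by subject
  geom_smooth(aes(group = 1), method = "lm",      # one regression line per facet
              se = FALSE, colour = "black") +
  facet_wrap(~variable, scales = "free_x") +
  labs(x = "variables", y = "svolumes")

print(p)
```

Because group = 1 is set only on geom_smooth, each facet gets a single line fitted across all subjects. Dropping it and mapping colour = subject at the top level would instead fit one line per subject in each facet.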
