R/Swirl at master · raimundojimenez/R

install.packages("swirl")

library(swirl)

swirl()

install_from_swirl("R Programming")

sudo dnf install R

sudo dnf install R-devel

sudo dnf install libxml2-devel

sudo dnf install libcurl-devel

sudo dnf groupinstall "C Development Tools and Libraries"

Installing package into ‘/usr/lib64/R/library’

(as ‘lib’ is unspecified)

also installing the dependencies ‘memoise’, ‘stringi’, ‘magrittr’, ‘crayon’, ‘jsonlite’,

‘mime’, ‘curl’, ‘R6’, ‘bitops’, ‘stringr’, ‘testthat’, ‘httr’, ‘yaml’, ‘RCurl’, ‘digest’

| When you are at the R prompt (>):

| -- Typing skip() allows you to skip the current question.

| -- Typing play() lets you experiment with R on your own; swirl will ignore what you do...

| -- UNTIL you type nxt() which will regain swirl's attention.

| -- Typing bye() causes swirl to exit. Your progress will be saved.

| -- Typing main() returns you to swirl's main menu.

| -- Typing info() displays these options again.

| If at any point you'd like more information on a particular topic related to R, you can type help.start() at the prompt, which will open a menu of resources (either within RStudio

| or your default web browser, depending on your setup). Alternatively, a simple web search often yields the answer you're looking for.

| Anytime you have questions about a particular function, you can access R's built-in help files via the `?` command. For example, if you want more information on the c() function,

| type ?c without the parentheses that normally follow a function name. Give it a try.

The output type is determined from the highest type of the components in the hierarchy NULL < raw < logical < integer < double < complex < character < list < expression. Pairlists are treated as lists, but non-vector components (such as names and calls) are treated as one-element lists which cannot be unlisted even if recursive = TRUE.
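A minimal sketch of that hierarchy in action, using made-up values: each call to c() mixes two types, and typeof() reports the higher of the two.

```r
# Each call combines two types; the result takes the higher one in the hierarchy.
types <- c(
  logical_integer  = typeof(c(TRUE, 2L)),
  integer_double   = typeof(c(2L, 3.5)),
  double_character = typeof(c(3.5, "a")),
  character_list   = typeof(c("a", list(1)))
)
print(types)
```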

getwd()

ls() # objects in memory (workspace)

list.files()

dir()

?list.files

args(list.files)

dir.create("testdir")

setwd("testdir")

file.create("mytest.R")

file.exists("mytest.R")

file.info("mytest.R")

file.info("mytest.R")$mode

file.rename("mytest.R", "mytest2.R")

file.copy("mytest2.R","mytest3.R")

file.path("mytest3.R")

| You can use file.path to construct file and directory paths that are independent of the operating system your R code is

| running on. Pass 'folder1' and 'folder2' as arguments to file.path to make a platform-independent pathname.

> file.path("folder1", "folder2")

[1] "folder1/folder2"
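A small sketch of file.path() combined with the directory functions above. It works under tempdir() so nothing is left behind in the working directory; the folder names are just the ones from the swirl prompt.

```r
# Build a nested path under the session's temporary directory.
base_dir <- tempdir()
nested <- file.path(base_dir, "folder1", "folder2")

# recursive = TRUE creates folder1 and folder2 in one call.
dir.create(nested, recursive = TRUE)
created <- dir.exists(nested)
print(created)

# basename() and dirname() take the path apart again.
print(basename(nested))
print(basename(dirname(nested)))

# Clean up the example directories.
unlink(file.path(base_dir, "folder1"), recursive = TRUE)
```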

?dir.create

dir.create(file.path("testdir2", "testdir3"), recursive=TRUE)

unlink("testdir2", recursive = TRUE)

### Sequence of numbers

1:10

pi:10

15:1

?`:`

seq(1,20)

seq(0, 10, by=0.5)

seq(5, 10, length=30)

my_seq <- seq(5, 10, length=30)

length(my_seq)

1:length(my_seq)

seq(along.with = my_seq)

seq_along(my_seq)

rep(0, times=40)

rep(c(0, 1, 2), times = 10)

rep(c(0, 1, 2), each = 10)

### Vectors

| In previous lessons, we dealt entirely with numeric vectors, which are one type of atomic vector. Other types of atomic

| vectors include logical, character, integer, and complex. In this lesson, we'll take a closer look at logical and

| character vectors.

num_vect <- c(0.5, 55, -10, 6)

tf <- num_vect < 1

| If we have two logical expressions, A and B, we can ask whether at least one is TRUE with A | B (logical 'or' a.k.a.

| 'union') or whether they are both TRUE with A & B (logical 'and' a.k.a. 'intersection'). Lastly, !A is the negation of

| A and is TRUE when A is FALSE and vice versa.
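A quick illustration of the three operators on two hypothetical logical vectors, a and b. All three work element by element.

```r
a <- c(TRUE, TRUE, FALSE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE)

either <- a | b   # TRUE where at least one is TRUE
both   <- a & b   # TRUE where both are TRUE
not_a  <- !a      # flips each element of a

print(rbind(a, b, either, both, not_a))
```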

my_char <- c("My", "name", "is")

paste(my_char, collapse = " ")

my_name <- c(my_char, "Ray")

paste("Hello", "world!", sep = " ")

> paste(c(1:3), c("X", "Y", "Z"), sep="")

[1] "1X" "2Y" "3Z"

| Vector recycling! Try paste(LETTERS, 1:4, sep = "-"), where LETTERS is a predefined variable in R containing a

| character vector of all 26 letters in the English alphabet.

> paste(LETTERS, 1:4, sep = "-")

[1] "A-1" "B-2" "C-3" "D-4" "E-1" "F-2" "G-3" "H-4" "I-1" "J-2" "K-3" "L-4" "M-1" "N-2" "O-3" "P-4" "Q-1" "R-2" "S-3"

[20] "T-4" "U-1" "V-2" "W-3" "X-4" "Y-1" "Z-2"

### Missing values

x <- c(44, NA, 5, NA)

| To make things a little more interesting, let's create a vector containing 1000 draws from a standard normal

| distribution with y <- rnorm(1000).

> y <- rnorm(1000)

| Next, let's create a vector containing 1000 NAs with z <- rep(NA, 1000).

> z <- rep(NA, 1000)

| Finally, let's select 100 elements at random from these 2000 values (combining y and z) such that we don't know how

| many NAs we'll wind up with or what positions they'll occupy in our final vector -- my_data <- sample(c(y, z), 100).

> my_data <- sample(c(y, z), 100)

| Let's first ask the question of where our NAs are located in our data. The is.na() function tells us whether each

| element of a vector is NA. Call is.na() on my_data and assign the result to my_na.

> my_na <- is.na(my_data)

| So, back to the task at hand. Now that we have a vector, my_na, that has a TRUE for every NA and FALSE for every

| numeric value, we can compute the total number of NAs in our data.

| The trick is to recognize that underneath the surface, R represents TRUE as the number 1 and FALSE as the number 0.

| Therefore, if we take the sum of a bunch of TRUEs and FALSEs, we get the total number of TRUEs.

| Let's give that a try here. Call the sum() function on my_na to count the total number of TRUEs in my_na, and thus the

| total number of NAs in my_data. Don't assign the result to a new variable.

> sum(my_na)

[1] 51

Inf - Inf --> NaN

0 / 0 --> NaN
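A short sketch of how NA and NaN differ when tested, on a made-up vector: is.na() flags both, while is.nan() flags only NaN.

```r
vals <- c(1, NA, 0 / 0, Inf - Inf)

na_flags  <- is.na(vals)   # TRUE for NA and for NaN
nan_flags <- is.nan(vals)  # TRUE only for NaN

# Summing a logical vector counts its TRUEs.
print(sum(na_flags))
print(sum(nan_flags))
```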

### Subsetting vectors

x[1:10]

x[is.na(x)]

y <- x[!is.na(x)]

y[y>0]

x[!is.na(x) & x>0]

x[c(3,5,7)]

| Luckily, R accepts negative integer indexes. Whereas x[c(2, 10)] gives us ONLY the 2nd and 10th elements of x, x[c(-2,

| -10)] gives us all elements of x EXCEPT for the 2nd and 10th elements. Try x[c(-2, -10)] now to see this.

> x[c(-2, -10)]
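The same idea on a short hypothetical vector v, where the result is easy to check by eye:

```r
v <- c(10, 20, 30, 40, 50)   # hypothetical example vector

kept    <- v[c(2, 4)]     # only the 2nd and 4th elements
dropped <- v[c(-2, -4)]   # everything except the 2nd and 4th
same    <- v[-c(2, 4)]    # shorthand for the line above

print(kept)
print(dropped)
```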

| Create a numeric vector with three named elements using vect <- c(foo = 11, bar = 2, norf = NA).

> vect <- c(foo = 11, bar = 2, norf = NA)

names(vect)

vect2 <- c(11, 2, NA)

names(vect2) <- c("foo", "bar", "norf")

> identical(vect, vect2)

[1] TRUE

vect["bar"]

# 7: Matrices and Data Frames

| In this lesson, we'll cover matrices and data frames. Both represent 'rectangular' data types, meaning that they are

| used to store tabular data, with rows and columns.

| The main difference, as you'll see, is that matrices can only contain a single class of data, while data frames can

| consist of many different classes of data.

my_vector <- 1:20

> dim(my_vector)

NULL

> length(my_vector)

[1] 20

dim(my_vector) <- c(4, 5)

> dim(my_vector)

[1] 4 5

> attributes(my_vector)

$dim

[1] 4 5

> my_vector

     [,1] [,2] [,3] [,4] [,5]

[1,]    1    5    9   13   17

[2,]    2    6   10   14   18

[3,]    3    7   11   15   19

[4,]    4    8   12   16   20

> class(my_vector)

[1] "matrix"

my_matrix <- my_vector

?matrix

my_matrix2 <- matrix(1:20, 4, 5)

> identical(my_matrix, my_matrix2)

[1] TRUE

patients <- c("Bill", "Gina", "Kelly", "Sean")

cbind(patients, my_matrix)

> cbind(patients, my_matrix)

     patients

[1,] "Bill"   "1" "5" "9"  "13" "17"

[2,] "Gina"   "2" "6" "10" "14" "18"

[3,] "Kelly"  "3" "7" "11" "15" "19"

[4,] "Sean"   "4" "8" "12" "16" "20"

my_data <- data.frame(patients, my_matrix)

> my_data

  patients X1 X2 X3 X4 X5

1     Bill  1  5  9 13 17

2     Gina  2  6 10 14 18

3    Kelly  3  7 11 15 19

4     Sean  4  8 12 16 20

> class(my_data)

[1] "data.frame"

cnames <- c("patient", "age", "weight", "bp", "rating", "test")

colnames(my_data) <- cnames

> my_data

  patient age weight bp rating test

1    Bill   1      5  9     13   17

2    Gina   2      6 10     14   18

3   Kelly   3      7 11     15   19

4    Sean   4      8 12     16   20

ints <- sample(10)

> ints

[1] 8 7 2 1 5 6 9 10 3 4

> ints > 5

[1] TRUE TRUE FALSE FALSE FALSE TRUE TRUE TRUE FALSE FALSE

> which(ints > 7)

[1] 1 7 8

| Like the which() function, the functions any() and all() take logical vectors as their argument. The any() function

| will return TRUE if one or more of the elements in the logical vector is TRUE. The all() function will return TRUE if

| every element in the logical vector is TRUE.
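A brief sketch of which(), any(), and all() side by side, using a made-up scores vector:

```r
scores <- c(3, 8, 5, 10)   # hypothetical example data

any_big <- any(scores > 9)    # at least one element above 9?
all_pos <- all(scores > 0)    # every element positive?
where   <- which(scores > 4)  # positions of the TRUE elements

print(any_big)
print(all_pos)
print(where)
```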

Sys.Date()

| The mean() function takes a vector of numbers as input, and returns the average of all of the numbers in the input

| vector. Inputs to functions are often called arguments. Providing arguments to a function is also sometimes called

| passing arguments to that function. Arguments you want to pass to a function go inside the function's parentheses. Try

| passing the argument c(2, 4, 5) to the mean() function.

| To understand computations in R, two slogans are helpful: 1. Everything that exists is an object. 2. Everything that

| happens is a function call.
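One way to see both slogans, as a rough sketch: operators such as + and [ can be called by name like any other function, and a function can be assigned to a variable like any other object.

```r
# Infix operators are ordinary functions that can be called by name.
sum_infix  <- 2 + 3
sum_prefix <- `+`(2, 3)

# Subsetting is a function call too.
v <- c(8, 4, 0)
first <- `[`(v, 1)

# Functions are objects, so they can be stored and inspected like data.
f <- mean
print(class(f))
print(f(c(2, 4, 5)))
```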

sum(my_vector)/length(my_vector)

Modulus : num %% divisor

---

# You can pass functions as arguments to other functions just like you can pass

# data to functions. Let's say you define the following functions:

# add_two_numbers <- function(num1, num2){

# num1 + num2

# }

# multiply_two_numbers <- function(num1, num2){

# num1 * num2

# }

# some_function <- function(func){

# func(2, 4)

# }

# As you can see we use the argument name "func" like a function inside of

# "some_function()." By passing functions as arguments

# some_function(add_two_numbers) will evaluate to 6, while

# some_function(multiply_two_numbers) will evaluate to 8.

# Finish the function definition below so that if a function is passed into the

# "func" argument and some data (like a vector) is passed into the dat argument

# the evaluate() function will return the result of dat being passed as an

# argument to func.

# Hints: This exercise is a little tricky so I'll provide a few examples of how

# evaluate() should act:

# 1. evaluate(sum, c(2, 4, 6)) should evaluate to 12

# 2. evaluate(median, c(7, 40, 9)) should evaluate to 9

# 3. evaluate(floor, 11.1) should evaluate to 11

evaluate <- function(func, dat){

  # Write your code here!

  # Remember: the last expression evaluated will be returned!

  func(dat)

}

evaluate(function(x){x+1}, 6)

evaluate(sd, c(1.4, 3.6, 7.9, 8.8))

evaluate(function(x){x[1]}, c(8, 4, 0))

---

Standard Deviation : sd

?paste

paste("Programming", "is", "fun!")

# The ellipses can be used to pass on arguments to other functions that are

# used within the function you're writing. Usually a function that has the

# ellipses as an argument has the ellipses as the last argument. The usage of

# such a function would look like:

# ellipses_func(arg1, arg2 = TRUE, ...)

# In the above example arg1 has no default value, so a value must be provided

# for arg1. arg2 has a default value, and other arguments can come after arg2

# depending on how they're defined in the ellipses_func() documentation.

# Interestingly the usage for the paste function is as follows:

# paste (..., sep = " ", collapse = NULL)

# Notice that the ellipses is the first argument, and all other arguments after

# the ellipses have default values. Arguments that come after an ellipses can

# only be matched by their full name, so in practice they are almost always

# given default values. Take a look at the simon_says function below:

# simon_says <- function(...){

# paste("Simon says:", ...)

# }

# The simon_says function works just like the paste function, except the

# beginning of every string is prepended by the string "Simon says:"

# Telegrams used to be peppered with the words START and STOP in order to

# demarcate the beginning and end of sentences. Write a function below called

# telegram that formats sentences for telegrams.

# For example the expression `telegram("Good", "morning")` should evaluate to:

# "START Good morning STOP"

telegram <- function(...){

  paste("START", ..., "STOP")

}
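A usage sketch for telegram(), repeated here so the block runs on its own. The second call is an extra illustration, not part of the exercise: because the ellipsis is handed straight to paste(), a named sep argument is forwarded as well.

```r
telegram <- function(...){
  paste("START", ..., "STOP")
}

msg <- telegram("Good", "morning")
print(msg)

# Named arguments such as sep are forwarded to paste() through the ellipsis too.
dashed <- telegram("Good", "morning", sep = "-")
print(dashed)
```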

---

# Let's explore how to "unpack" arguments from an ellipses when you use the

# ellipses as an argument in a function. Below I have an example function that

# is supposed to add two explicitly named arguments called alpha and beta.

# add_alpha_and_beta <- function(...){

# # First we must capture the ellipsis inside of a list

# # and then assign the list to a variable. Let's name this

# # variable `args`.

# args <- list(...)

# # We're now going to assume that there are two named arguments within args

# # with the names `alpha` and `beta`. We can extract named arguments from

# # the args list by using the name of the argument and double brackets. The

# # `args` variable is just a regular list after all!

# alpha <- args[["alpha"]]

# beta <- args[["beta"]]

# # Then we return the sum of alpha and beta.

# alpha + beta

# }

# Have you ever played Mad Libs before? The function below will construct a

# sentence from parts of speech that you provide as arguments. We'll write most

# of the function, but you'll need to unpack the appropriate arguments from the

# ellipses.

mad_libs <- function(...){

  # Do your argument unpacking here!

  args <- list(...)

  place <- args[['place']]

  adjective <- args[['adjective']]

  noun <- args[['noun']]

  # Don't modify any code below this comment.

  # Notice the variables you'll need to create in order for the code below to

  # be functional!

  paste("News from", place, "today where", adjective, "students took to the streets in protest of the new", noun, "being installed on campus.")

}
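A reduced sketch of the same unpacking pattern. The describe() function and its sentence are hypothetical, made up for illustration. Since the arguments are pulled out of the list by name, the order in which they are passed does not matter.

```r
# Hypothetical helper: unpacks two named arguments from the ellipsis.
describe <- function(...){
  args <- list(...)
  noun <- args[["noun"]]
  adjective <- args[["adjective"]]
  paste("The", adjective, "students protested the new", noun)
}

# Arguments are matched by name, so their order in the call does not matter.
s1 <- describe(adjective = "angry", noun = "statue")
s2 <- describe(noun = "statue", adjective = "angry")
print(s1)
```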

---

# The syntax for creating new binary operators in R is unlike anything else in

# R, but it allows you to define a new syntax for your function. I would only

# recommend making your own binary operator if you plan on using it often!

# User-defined binary operators have the following syntax:

# %[whatever]%

# where [whatever] represents any valid variable name.

# Let's say I wanted to define a binary operator that multiplied two numbers and

# then added one to the product. An implementation of that operator is below:

# "%mult_add_one%" <- function(left, right){ # Notice the quotation marks!

# left * right + 1

# }

# I could then use this binary operator like `4 %mult_add_one% 5` which would

# evaluate to 21.

# Write your own binary operator below from absolute scratch! Your binary

# operator must be called %p% so that the expression:

# "Good" %p% "job!"

# will evaluate to: "Good job!"

"%p%" <- function(left, right){ # Remember to add arguments!

  paste(left, right)

}
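A usage sketch for %p%, with the definition repeated so the block is self-contained. The chained call is an extra illustration: user-defined %...% operators group from left to right.

```r
"%p%" <- function(left, right){
  paste(left, right)
}

greeting <- "Good" %p% "job!"
print(greeting)

# Custom operators are left-associative, so they chain from left to right.
chained <- "I" %p% "love" %p% "R!"
print(chained)
```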

---

http://archive.ics.uci.edu/ml/datasets/Flags

head(flags)

dim(flags)

class(flags)

| The lapply() function takes a list as input, applies a function to each element of the list, then returns a list of the

| same length as the original one. Since a data frame is really just a list of vectors (you can see this with

| as.list(flags)), we can use lapply() to apply the class() function to each column of the flags dataset. Let's see it in

| action!

cls_list <- lapply(flags, class)

class(cls_list)

| You may remember from a previous lesson that lists are most helpful for storing multiple classes of data. In this case,

| since every element of the list returned by lapply() is a character vector of length one (i.e. "integer" and "vector"),

| cls_list can be simplified to a character vector. To do this manually, type as.character(cls_list).

as.character(cls_list)

| sapply() allows you to automate this process by calling lapply() behind the scenes, but then attempting to simplify

| (hence the 's' in 'sapply') the result for you. Use sapply() the same way you used lapply() to get the class of each

| column of the flags dataset and store the result in cls_vect. If you need help, type ?sapply to bring up the

| documentation.

cls_vect <- sapply(flags, class)
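The flags data only exists inside the swirl lesson, so here is a sketch of the lapply()/sapply() difference on a small hypothetical data frame df with made-up columns:

```r
# Hypothetical stand-in for the flags data frame.
df <- data.frame(
  name  = c("A", "B", "C"),
  bars  = c(0L, 3L, 1L),
  ratio = c(0.5, 1.5, 2.0),
  stringsAsFactors = FALSE
)

cls_list <- lapply(df, class)  # list with one element per column
cls_vect <- sapply(df, class)  # simplified to a named character vector

print(class(cls_list))
print(cls_vect)
```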

| Therefore, if we want to know the total number of countries (in our dataset) with, for example, the color orange on

| their flag, we can just add up all of the 1s and 0s in the 'orange' column. Try sum(flags$orange) to see this.

> sum(flags$orange)

flag_colors <- flags[, 11:17]

head(flag_colors)

lapply(flag_colors, sum)

sapply(flag_colors, sum)

sapply(flag_colors, mean)

flag_shapes <- flags[, 19:23]

| The range() function returns the minimum and maximum of its first argument, which should be a numeric vector. Use

| lapply() to apply the range function to each column of flag_shapes. Don't worry about storing the result in a new

| variable. By now, we know that lapply() always returns a list.

> lapply(flag_shapes, range)

$circles

[1] 0 4

$crosses

[1] 0 2

$saltires

[1] 0 1

$quarters

[1] 0 4

$sunstars

[1] 0 50

> sapply(flag_shapes, range)

     circles crosses saltires quarters sunstars

[1,]       0       0        0        0        0

[2,]       4       2        1        4       50

| When given a vector, the unique() function returns a vector with all duplicate elements removed. In other words,

| unique() returns a vector of only the 'unique' elements. To see how it works, try unique(c(3, 4, 5, 5, 5, 6, 6)).

unique_vals <- lapply(flags, unique)

unique_vals

sapply(unique_vals, length)

lapply(unique_vals, function(elem) elem[2])
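A self-contained sketch of the last two calls, using a hypothetical list vals in place of unique_vals. The function passed to lapply() has no name (an anonymous function) and simply returns the second element of whatever it is given.

```r
# Hypothetical list of vectors standing in for unique_vals.
vals <- list(a = c(3, 4, 5), b = c("x", "y"), c = c(10, 20, 30, 40))

seconds <- lapply(vals, function(elem) elem[2])   # second element of each
lengths_vect <- sapply(vals, length)              # number of elements in each

print(seconds)
print(lengths_vect)
```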
