Chapter 1 What is the cloud

You may have heard the term cloud computing before, and data scientists often talk about working on the cloud. But what exactly is the cloud?

Cloud storage refers to data or document storage on the Internet rather than on your personal computer. If you take pictures with your phone and they are backed up to iCloud or Google Photos, you are using the cloud. Using the cloud for storage is like having an external hard drive (a portable storage device) that you never see and can’t actually hold in your hands.

1.0.1 Cloud computing

Cloud computing involves applications and software that run in shared data centers rather than on the computer sitting in front of you. For data analysis, cloud computing has changed the way we think about working with data, especially when it comes to large datasets. A data analyst no longer needs to spend thousands of dollars on high-capacity computers to deal with big data, because the personal computer no longer has to do all the heavy lifting. Instead, a network of computers (from Amazon, IBM, or Microsoft, among many others) will do the work. Your local computer only needs to run the interface software, which is often just your Internet browser. In future lessons we will study cloud-based data applications in more detail.

1.0.2 What are the advantages of using the cloud?

A major advantage of the cloud is the ability to access your files from anywhere, even if you don’t have your personal computer with you. Since storage and applications work over the Internet, they can be accessed from any computer with an Internet connection.

An important advantage of cloud storage is that your files are safe even if your computer is lost or damaged. Because your files are not stored on your computer itself, they are safe and available even if your computer is stolen, you spill coffee on your keyboard, or there’s a natural disaster in your area. Moreover, most cloud storage services provide back-up services in case you delete files by mistake. These back-up services allow for accidentally deleted files to be recovered and restored.

An advantage of cloud computing is an increase in computing power over what is available on your local machine. The remote machines used for cloud computing are typically more powerful than your personal computer and can often do your data analysis much faster.

Finally, working with the cloud puts the responsibility of maintaining software on the service provider rather than on you. When running software locally on your personal computer, you need to maintain applications by making sure that they still work and are up to date with the most recent versions. You must download and install the newest version of the software yourself or possibly even pay for a newer edition. With the cloud, service providers make sure the software is well-maintained and running optimally.

1.0.3 What are the disadvantages of using the cloud?

The most obvious disadvantage of working on the cloud is that you need an Internet connection to access storage and computing power. You cannot work “offline” away from the Internet. However, with wireless Internet service (Wi-Fi) widely available in libraries, coffee shops, and other public places, it’s possible to work on the cloud from almost anywhere!

There are also concerns with the privacy and security of data that is stored remotely. Privacy and security are issues that must be addressed by both providers and users of cloud-based services. Service providers need to ensure that the files stored at their data centers are safe and secure. Users need to take advantage of authentication measures and use strong passwords to ensure that no one else can gain access to their account. Most major cloud-based service providers do a good job of securing your data. In practice, this means you can avoid most security issues by being serious about protecting your account access information: choose a strong password and use two-factor authentication for logins.

Briefly, two-factor authentication is a way of proving your identity to a service provider in two steps. The first step is providing a password. The second step involves using a physical object in your possession, such as a phone, to prove your identity. For example, you may need to enter a code that is sent to your phone during the login process. This means someone would need both your password and physical possession of your phone to access your account. It is good practice to choose two-factor authentication whenever it is offered by a service provider.
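For readers curious about what the two steps look like from the service provider's side, here is a minimal Python sketch. It is only an illustration under simplifying assumptions, not how any particular provider implements logins: the function names (`hash_password`, `new_one_time_code`, `login`) are hypothetical, and the step of actually sending the code to a phone is left out.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password so the plain password is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def new_one_time_code() -> str:
    """Generate a random six-digit code (assumed to be sent to the user's phone)."""
    return f"{secrets.randbelow(1_000_000):06d}"


def login(stored_hash: bytes, salt: bytes, password: str,
          sent_code: str, entered_code: str) -> bool:
    """Allow the login only if BOTH the password and the code are correct."""
    # Step 1: something you know (the password).
    password_ok = hmac.compare_digest(stored_hash, hash_password(password, salt))
    # Step 2: something you have (the phone that received the code).
    code_ok = hmac.compare_digest(sent_code.encode("utf-8"),
                                  entered_code.encode("utf-8"))
    return password_ok and code_ok


# Example: the account is created once, then a login is attempted.
salt = secrets.token_bytes(16)
stored_hash = hash_password("correct horse battery staple", salt)
code = new_one_time_code()
print(login(stored_hash, salt, "correct horse battery staple", code, code))
```

The key point is the final `and`: a stolen password alone, or a stolen phone alone, is not enough to pass the check.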

1.0.4 Slides and Video

What is the Cloud?