Читать книгу «IT Cloud» онлайн полностью📖 — Eugeny Shtoltc — MyBook.
image

Eugeny Shtoltc
IT Cloud

Prologue

More than 70 (76) tools are considered in practice in the book:

* Google Cloud Platform, Amazone WEB Services, Microsoft Azure;

* console utilities: cat, sed, NPM, node, exit, curl, kill, Dockerd, ps, sudo, grep, git, cd, mkdir, rm, rmdir, mongos, Python, df, eval, ip, mongo, netstat, oc, pgrep, ping, pip, pstree, systemctl, top, uname, VirtualBox, which, sleep, wget, tar, unzip, ls, virsh, egrep, cp, mv, chmod, ifconfig, kvm, minishift;

* standard tools: NGINX, MinIO, HAProxy, Docker, Consul, Vagrant, Ansible, kvm;

* DevOps tools: Jenkins, GitLab CI, BASH, PHP, Micro Kubernetes, kubectl, Velero, Helm, "http load testing";

* cloud Traefic, Kubernetes, Envoy, Istio, OpenShift, OKD, Rancher ,;

* several programming languages: PHP, NodeJS, Python, Golang.

Containerization

Infrastructure development history

Limoncelli (author of "The Practice of Cloud System Administration"), who worked for a long time at Google Inc, believes that 2010 is the year of transition from the era of the traditional Internet to the era of cloud computing.

* 1985-1994 – the time of mainframes (large computers) and intra-corporate data exchange, in which you can easily plan the load

* 1995-2000 – the era of the emergence of Internet companies,

* 2000-2003

* 2003-2010

* 2010-2019

The increase in the productivity of a separate machine is less than the increase in cost, for example, an increase in productivity by 2 times leads to an increase in cost significantly more than 2 times. At the same time, each subsequent increase in productivity is much more expensive. Consequently, each new user was more expensive.

Later, in the period 2000-2003, an ecosystem was able to form, providing a fundamentally different approach:

* the emergence of distributed computing;

* the emergence of low-power mass equipment;

* maturation of OpenSource solutions, allowing you to install software on multiple machines, not bundled with a processor license;

* maturation of telecommunication infrastructure;

* increasing reliability due to the distribution of points of failure;

* the ability to increase performance if needed in the future by adding new components.

The next stage was unification, which was most pronounced in 2003-2010:

* providing in the data center not a place in the closet (power-location), but already unified hardware purchased in bulk for the whole cent;

* saving on resources;

* virtualization of the network and computers.

Amazon set another milestone in 2010 and ushered in the era of cloud computing. The stage is characterized by the construction of large-scale data cents with a deliberate surplus in capacity to obtain a lower cost of computing power due to wholesale, based on savings for oneself and a profitable sale of their surplus at retail. This approach is applied not only to computing power, infrastructure, but also software, forming it as services to reduce the cost of their use by selling them at retail to both large companies and beginners.

The need for uniformity of the environment

Usually, novice Linux developers prefer to work from under Windows, so as not to learn an unfamiliar OS and stuff their own cones on it, because before everything was far from so simple and so debugged. Often, developers are forced to work from under Windows because of corporate preferences: 1C, Directum and other systems run only on Windows, and the rest, and most importantly the network infrastructure, is tailored for this operating system. Working from Windows leads to a large loss of working time for both developers and DevOps on fixing both minor and major differences in operating systems. These differences start to show up from the simplest tasks, for example, that it may be easier to make a page in pure HTML. But an incorrectly configured editor will put in the BOM and line feeds accepted in Windows: "\ n \ r" instead of "\ n"). BOM, when gluing the header, body and footer of the page, will create indents between them, they will not be visible in the editor, since these are formed by bytes of meta information about the file type, which in Linux do not have such a meaning and are perceived as translation of the indentation. Other newlines in GIT do not allow you to see the difference you made, because the difference is on each line.

Now let's take the Front developer. At first glance, what is difficult, because JS (JavaScript), HTML and CSS are interpreted natively by the browser. Previously, the layout of all different pages was done – it was checked by the designer and the customer and was given to the PHP developer for integration with the framework or cms. In order not to edit the header on each page, and then find out for a long time when they started to differ and which one was more correctly used by HAML. HAML adds additional syntax to HTML to avoid boiling: loops, file connections, in our case, a single header and footer. But it requires a special program to transform HTML into pure HTML. In MS Windows, this is solved by installing the compiler program and connecting it to the IDE, since all these features are in the IDE WEBStorm. With CSS and its size, duplicates, dependencies and support for different browsers, everything is much more complicated – LESS was used there, and now he headed the more functional SASS and libraries of support for different browsers, which requires the RUBY compiler and such a bundle usually does not work the first time. And for JS, CoffeScript was used. All this needs to be run through compression and validation programs (HTML validation is usually not needed).

When the project starts to grow and ceases to be separate pages with "JS inserts", and becomes SPA (Single Page Application, one page web applications), where everything is created by JS, and already collectors (Galp, Grunt), package managers and NodeJS are not assembled, the difficulties are becoming more and more. All these programs are free and were originally developed for Linux, designed to work from the BASH console and under Windows do not always behave well and are difficult to automate in graphical interfaces, despite the efforts of the IDE developers. Therefore, many WEB developers have switched from MS Windows to MacOS, which is a fork of UNIX systems, BASH is natively built into it.

Docker as lightweight virtual machines

Initially, the problem of isolation of provisions and projects was solved by virtualization – system software that emulates at a certain level the environment, which can be hardware (a computer as a set of components such as a processor, RAM, network device and others, if necessary) or, less often, operating system. The system administrator chooses the amount of RAM (no more free), processor, network device. Installs the operating system and, if necessary, drivers, installs the necessary programs. If you need a workplace for a second developer, he does the same. To install programs, it looks into the / bin directory of the first one and installs the missing ones. And here the first quiet problem arises, which has not yet manifested itself, that the programs are installed in different versions, but this will be a headache for developers, if one developer has a lot of work, and the other does not, or a headache for this sysadmin – if the developer works, in production – not.

With the increase in the number of jobs, the following problems are added:

* Less than 30% of the performance of the parent system is available to you, because all the commands that the processor must execute are executed by the virtualization program. To increase performance, the VT-X processor mode allows the processor, in which the processor directly executes commands from the virtual environment, and in cases of incompatibility, it throws an exception. True, these throws are hundreds of times more expensive than ordinary commands, so adult virtualization systems (VirtualBox, VMVare, and others) try to filter and modify potentially incompatible commands, which can significantly increase performance.

* Each workstation has to be created anew, for this the system administrator writes a script that automates this process, but it is naturally not ideal, and coming to constantly update it, make patches of incompatibilities that were installed by the programmers.

* A linear increase in the occupied disk space from the number of containers, and an exponential increase from product versions, despite the fact that one instance takes up a lot of space. That is, each sandbox contains a program emulation instance, an operating system image, all installed programs and developer code, which is quite a lot. The only thing that is one thing is the installation of the virtual machine itself, and then only within the framework of one physical server. For example, if we have 10 developers, then the size will be 10 times larger, with 3 product versions – 30.

All these disadvantages for the WEB are intended to be solved by Docker. Hereinafter, we will talk about Docker for Linux, and will not consider the slightly different kernel implementation for FreeBSD, and the implementation for Windows 10 Professional, within which either a stripped-down Linux kernel or independent development of Windows containerization is purchased. The main idea is not to create a layer (virtualization of hardware, its own OS and hypervisor), but to differentiate rights. You do not multiply to put in the MS Windows container, but you can put both RedHut and Debian, since the kernel is one, and the differences are in the files, creating a sandbox (separate directories and prohibiting going beyond its chapels) with these files. Also, we are talking about WEB solutions, since for native solutions, problems may arise when the program needs to have exclusive access from the container (Docker sandbox) to the OS kernel, for example, for native rendering of windows. You can also limit the amount of memory, processor time, number of processes.

Lightweight Virtualization or Lightweight Isolation – A Look at Docker Implementation

Let's take a look at the history of the appearance of the prerequisites for the emergence of Docker, namely the prerequisites, since Docker itself does not implement isolation, let alone virtualization, but organizes work with it from the first. Unlike virtualization, which resembles a hangar with its own world and its own foundation, on which you can impose whatever your heart desires, for example, we give out a lawn, then for isolation you can draw an analogy with a fence. Isolation appeared in the Linux kernel gradually, in parts responsible for different levels, and in parallel, programs appeared to provide an interface and the concept of applying this isolation in real projects. Isolation consists of 6 types of resource limitation.

The first in the kernel was the isolation of the file system, which allows you to create a sandbox using the chroot command back in 1979, from outside the sandbox is completely visible, but when you go inside the folder over which the command is executed becomes root, and you will not be able to return. The next one was the delimitation of processes, so the sandbox exists and the host system as long as the process with pid (number) 1 exists. For the sandbox it is its own, outside the sandbox it is a normal process. Further, the steel distinctions of CGroups were tightened: user groups, memory and others. All this exists in the kernel of any Linux, regardless of whether you have Docker installed or not. Throughout history, attempts were made, OpenSource and commercial, to create containers, developing the functionality themselves, and similar solutions found their users, but they did not penetrate the masses. Docker at the beginning of its existence used a fairly stable but difficult to use LXC containerization solution. Gradually he replaced LXC with native CGroup. Docker also supports the salinity of its image (more on that later), but does not implement it itself, but uses UnuonFS (UFS).

Docker and disk space

Since Docker does not implement functionality, but uses the built-in Linux kernel, and does not have a graphical interface under the hood, it itself takes up very little space.

Since the container uses the host OS kernel, the base image (usually the OS) contains only complementary packages. So the Debian Docker image is 125Mb, and the ISO image is 290Mb. To check that one core is used in the container, we will display information about it: uname -a or cat / proc / version , and information about the container environment itself cat / etc / ussue

Docker builds an image based on the instructions in the Dockerfile, which can be located remotely or locally, and can be created from it at any time. Therefore, if you are not using the image at the moment, then you can delete it. An exception is the image created from the container using the Docker commit command , but it is not very correct to create this way, and you can always select from the Dockerfile image with the Docker history command and delete the image. The advantage of storing images is that you do not need to wait while it is being created: the OS and libraries are downloaded.

Docker itself uses an image called Image, which is built based on the instructions in the Dockerfile. When you create several containers on its basis, the space practically does not increase, since the container is just a process and config settings. When changing files in the container, the files themselves are not saved, but the changes made are saved, which will be deleted after the container is transferred. This guarantees in 99% of cases a completely identical environment, and as a result, it is not important to place preparatory operations common for all containers for installing specific programs in the image, a side effect of which is the absence of their duplication. To be able to save data, folders and files are mounted to the host (parent) system. Therefore, you can run a hundred or more containers on a regular computer, and you will not see any changes in the free local on the disk. At the same time, if the developers use git, and how can they do without it, and they often accumulate, then there may be no need to mount the folders with the source code.

The docker image is not a monolithic image of your product, but a layer cake of images, the layers of which are cached. This allows you to significantly save time for creating an image. Caching can be disabled with the build –no-cache = true command switch if Docker does not recognize that the data is mutable. Docker can see the changes in the ADD statement adding a file from the host system to the container by the hash of the file. So, if you create two containers, one with NGINX and the other with MySQL, both of which are based on Ubuntu 14.04, there will be three image layers: MySQL, NGINX, and Ubuntu. The images can be viewed with the Docker history command . It also works for your projects – when copying 2 versions of the code to your image using the ADD command with your product, you will have 3 layers and two images: the base one, with the code of the first version and the code of the second version, regardless of the number of containers. The number of layers is limited to 127. It is important to note that when cloning a project, you need to specify the version, not just git clone , but git clone –branch v1 and git clone –branch v2 , otherwise Docker will cache the layer created by the Git Clone command and when creating the second we get the same image.

Docker does not consume resources, but only limits them if it is specified in the settings when creating the container (for memory, the key is m, for the processor – c). Since Docker supports different containerization filesystems, there is no unified interface to customize. But, in any case, the resource is consumed as much as required, and not as much as allocated, as in virtual machines.

Such concern about the occupied disk space and the weightlessness of the containers themselves entails irresponsibility in downloading images and creating containers.

Garbage collection by container

Due to the fact that the container provides much more opportunities than a virtual machine, the situation is complicated by leaving garbage after Docker is running. The problem is solved simply by running the moore collector, which appeared in version 1.13 or more difficult for earlier versions by writing a script you need.

Just as simple to create a docker container run name_image , it is also simple to delete docker rm -f id_container . Often, in order to just experiment, it is convenient to run the container interactively docker run -ti name_image bash and we will immediately find ourselves in the container. When we exit it with Cntl + D , it will be stopped. To automatically remove the output field, use the –rm parameter . But because containers are so weightless, they are so easy to create, they are often thrown and not removed, which leads to their explosive growth. You can look at the running ones with the docker ps command , and at the stopped ones – docker ps -a . To prevent this, use the docker containers prune garbage collector , which was introduced in version 1.13 and which will remove all stopped containers. For earlier versions, use the docker rm $ script (docker ps -q -f status = exited) . If its launch is not desirable for you, most likely you are using Docker incorrectly, since it is almost as quick and easy to pull a container from an image as to restore it to work. If you need to save state in the container, then this is done by mounting folders or volumes.

На этой странице вы можете прочитать онлайн книгу «IT Cloud», автора Eugeny Shtoltc. Данная книга имеет возрастное ограничение 12+, относится к жанру «Зарубежная компьютерная литература». Произведение затрагивает такие темы, как «изучение компьютера», «самиздат». Книга «IT Cloud» была написана в 2021 и издана в 2021 году. Приятного чтения!