Simple Summary of Basic Knowledge of Node.js

Since its birth in 2009, Node.js has been developing for more than two years, and its rapid growth is evident to all. From overtaking Rails in the number of visits on GitHub, to founder Ryan Dahl joining Joyent and securing corporate funding at the end of last year, to the release of the Windows port this year, the prospects of Node.js have been recognized by the technical community. InfoQ has been following the development of Node.js and held dedicated talks on it at both QCon conferences this year (Beijing and Hangzhou). To better promote Node.js in China, we have decided to open a column named 'Deep Dive into Node.js', inviting evangelists, developers, and technical experts from the Node.js field to discuss various aspects of Node.js, so that readers can gain a deeper understanding of it and take an active part in discussing and practicing this new technology.

The first article of this column, "What is Node.js", tries to elaborate on the basic concepts, development history, advantages, etc. of Node.js from various perspectives. Developers who are not familiar with this field can understand some basic knowledge of Node.js through this article.

Let's talk about the name

There are more and more technical reports about Node.js, and the name is written in many different ways: some write NodeJS, some write Nodejs. Which spelling is the standard one? Let's follow the official usage. The official Node.js website has always called it "Node" or "Node.js", and no other names appear there. "Node" is used most often, but because the word "Node" is so broad in meaning and usage that it can easily mislead developers, we have adopted the second name, "Node.js". The .js suffix points to the original intention of the Node project. The other names vary and have no clear origin, so we do not recommend using them.

Node.js is not a JS application, but a JS runtime platform

Upon seeing the name Node.js, beginners may mistakenly think it is a JavaScript application. In fact, Node.js is written in C++ and is a JavaScript runtime environment. Why was C++ chosen? According to Ryan Dahl, the founder of Node.js, he initially hoped to write Node.js in Ruby, but later found that the performance of the Ruby virtual machine could not meet his requirements. He then tried the V8 engine, and so chose C++. Since it is not a JavaScript application, why is it called .js? Because Node.js is a JavaScript runtime environment. When JavaScript is mentioned, people usually think first of the browsers we use every day. Modern browsers contain various components, including a rendering engine and a JavaScript engine, and the JavaScript engine is responsible for interpreting and executing the JavaScript code in web pages. As one of the most important languages of the web front end, JavaScript has always been the territory of front-end engineers. Node.js, however, is a JavaScript runtime environment for the back end (supported systems include *nix systems such as Linux, as well as Windows), which means you can write system-level or server-side JavaScript code and hand it to Node.js to interpret and execute. A simple command looks like this:

# node helloworld.js

Node.js adopts the V8 engine from the Google Chrome browser, which gives it good performance, and it also provides many system-level APIs, such as file operations and network programming. JavaScript code running in the browser is subject to various security restrictions and can perform only limited operations on the client system. In comparison, Node.js is a full-featured back-end runtime that provides many of the capabilities that other languages offer.
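As a rough illustration of the system-level file APIs mentioned above, here is a minimal sketch using the built-in fs, os, and path modules. The file name is a hypothetical placeholder chosen for this example, and the read uses the callback form of fs.readFile, so the call returns before the file contents arrive.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical file name inside the system temporary directory.
const filePath = path.join(os.tmpdir(), 'node-intro-example-' + process.pid + '.txt');
const order = [];

// Create a small file to read back.
fs.writeFileSync(filePath, 'Hello from Node.js\n');

// Read the file; the callback runs later, once the data is available.
fs.readFile(filePath, 'utf8', (err, text) => {
  if (err) throw err;
  order.push('file contents arrived');
  console.log(text.trim(), order);
  fs.unlinkSync(filePath); // clean up the temporary file
});

// This line runs before the callback above.
order.push('readFile call returned');
```

When run with node, the log shows that 'readFile call returned' was recorded before 'file contents arrived', which is not something browser-side JavaScript could do with a local file.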

Node.js adopts event-driven and asynchronous programming, designed for network services

The term 'event-driven' is not unfamiliar. In network programming with some traditional languages, we use callback functions: for example, when a socket resource reaches a certain state, the registered callback function is executed. The design of Node.js is centered on events, and the vast majority of its APIs are event-based and asynchronous in style. Take the net module as an example: its net.Socket object has the events connect, data, end, timeout, drain, error, and close, among others. Developers using Node.js register callback functions for these events according to their business logic. These callbacks are executed asynchronously: although they appear to be registered sequentially in the code, they do not run in the order in which they appear, but wait for the corresponding events to be triggered. The important advantage of event-driven, asynchronous programming (interested readers can refer to the author's other article, 'The Asynchronous Programming Style of Node.js') is that it makes full use of system resources: code does not block waiting for an operation to complete, so limited resources can be used for other tasks. This design is well suited to back-end network service programming, which is exactly what Node.js is aimed at. In server development, handling concurrent requests is a major problem, and blocking functions lead to wasted resources and added latency. With event registration and asynchronous functions, developers can raise resource utilization and improve performance.
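The point that callbacks run when their events fire, not in the order they are registered, can be sketched without any network access. The example below uses a plain EventEmitter as a stand-in for a socket (net.Socket is itself an EventEmitter); the name fakeSocket and the manual emit calls are assumptions made for illustration. Note that emit invokes handlers synchronously here, whereas a real socket's events are delivered by the event loop as network activity occurs.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a socket, used only for illustration.
const fakeSocket = new EventEmitter();
const log = [];

// Handlers are registered in the order: end, data, connect...
fakeSocket.on('end', () => log.push('end'));
fakeSocket.on('data', (chunk) => log.push('data:' + chunk));
fakeSocket.on('connect', () => log.push('connect'));

// ...but they run in the order in which the events are triggered.
fakeSocket.emit('connect');
fakeSocket.emit('data', 'hello');
fakeSocket.emit('end');

console.log(log);
```

The log ends up ordered connect, data, end, following the events rather than the registration order.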

The modules that ship with Node.js show that many functions, including file operations, are executed asynchronously, which differs from traditional languages. Moreover, to make server development easier, Node.js includes a large number of network modules, such as HTTP, DNS, NET, UDP, HTTPS, and TLS, which allow developers to build web servers quickly. Take the simple helloworld.js as an example:

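var http = require('http');
// Create an HTTP server; the callback runs once for every incoming request.
http.createServer(function (req, res) {
  // Send status code 200 and a plain-text content type.
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
}).listen(80, "127.0.0.1"); // Listen on local port 80 (may require elevated privileges).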

The above code builds a simple HTTP server (a running example is deployed at http://helloworld.cnodejs.net/ for readers to visit) that listens on local port 80. For any HTTP request, the server returns a response with status code 200, a Content-Type of text/plain, and the text 'Hello World'. From this small example, we can see several points:

Node.js is quite convenient for network programming. The modules it provides (in this case, http) expose easy-to-use APIs, and just a few lines of code are enough to build a server.

It embodies event-driven and asynchronous programming. A callback function (implemented as a JavaScript anonymous function) is passed as a parameter to createServer, and when an HTTP request arrives, Node.js calls this callback to handle the request and the response. Of course, this example is relatively simple and does not register many events. In future articles, readers will see more practical examples.
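To show a few more event registrations than the hello world example contains, here is a hedged sketch of the same server with explicit listening, error, and close handlers, plus a client request made from the same process. The function name handleRequest and the use of port 0 (which asks the operating system for any free port) are choices made for this illustration and are not part of the original example.

```javascript
const http = require('http');

// Same behavior as the hello world callback, given a name so it can be reused.
function handleRequest(req, res) {
  res.writeHead(200, { 'Content-Type': 'text/plain' });
  res.end('Hello World\n');
}

const server = http.createServer(handleRequest);

// Runs once the server is accepting connections.
server.on('listening', () => {
  const port = server.address().port;
  // agent: false uses a one-off connection that is closed after the response.
  http.get({ host: '127.0.0.1', port: port, agent: false }, (res) => {
    let body = '';
    res.setEncoding('utf8');
    res.on('data', (chunk) => { body += chunk; });
    res.on('end', () => {
      console.log(res.statusCode, JSON.stringify(body));
      server.close(); // stop accepting connections so the process can exit
    });
  });
});

// Runs if the server cannot start, for example when binding is not permitted.
server.on('error', (err) => console.error('server error:', err.message));
server.on('close', () => console.log('server closed'));

server.listen(0, '127.0.0.1');
```

Each handler is only registered here; none of them runs until the corresponding event occurs, so the order of the on calls does not matter.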

Characteristics of Node.js

Next, let's talk about the characteristics of Node.js. Event-driven and asynchronous programming have already been explained in detail above, so we will not repeat them here.

Node.js has good performance. According to founder Ryan Dahl, performance was an important consideration in the design of Node.js, and the choice of C++ and V8 rather than Ruby or other virtual machines was also made for performance reasons. Node.js is also bold in its design: it runs in a single-process, single-threaded mode (surprising, isn't it? This is consistent with the way JavaScript runs). The event-driven mechanism is implemented internally through an efficient single-threaded event loop, without the resource consumption and context switching of multi-threading. This means that even when facing large numbers of HTTP requests, Node.js handles them all through its event-driven mechanism. Network service developers accustomed to traditional languages may be very familiar with multi-threaded concurrency and coordination, but with Node.js we need to accept and understand its characteristics. Can we infer from this that such a design concentrates the load on the CPU (the event loop processing) rather than on memory (remember the days when Java virtual machines threw OutOfMemory exceptions)? Seeing is believing, so let's look at a performance test of Node.js by the Taobao Shared Data Platform team:

Physical machine configuration: RHEL 5.2, 2.2 GHz CPU, 4 GB memory

Node.js application scenario: MemCache proxy, fetching 100 bytes of data per request

Connection pool size: 50

Number of concurrent users: 100

Test results (socket mode): memory (30 MB), QPS (16700), CPU (95%)

From the results above, we can see that in this test scenario the QPS reaches 16700 while memory usage is only 30 MB (of which the V8 heap uses 22 MB), and CPU usage reaches 95%, which may become the bottleneck. Many other practitioners have also analyzed the performance of Node.js; overall its performance is convincing, and this is an important reason for its popularity. Since Node.js uses a single-process, single-threaded model, how can Node.js, with its excellent single-core performance, take advantage of multi-core CPUs now that multi-core hardware is common? Founder Ryan Dahl suggests running multiple Node.js processes and coordinating their work through some communication mechanism. Many third-party modules for multi-process support have already been released, and later articles in this column will cover programming Node.js on multi-core CPUs in detail.
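One consequence of the single-threaded model discussed above is that a callback can run only after the currently executing code returns control to the event loop, which is also why CPU-heavy work becomes the bottleneck. The following minimal sketch illustrates this with a timer and a deliberate busy loop; the 50 ms duration is an arbitrary value chosen for the example.

```javascript
const order = [];

// Ask the event loop to run this callback as soon as possible.
setTimeout(() => {
  order.push('timer callback');
  console.log(order);
}, 0);

// Busy-wait for about 50 ms. Nothing else can run on the single thread meanwhile,
// so the timer callback above has to wait even though its delay is 0.
const start = Date.now();
while (Date.now() - start < 50) {
  // deliberately blocking
}

order.push('synchronous code finished');
```

When the timer callback finally runs, 'synchronous code finished' is already first in the list. In a server, a loop like this would delay every pending request, which is why long computations are usually moved to separate processes.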

Another characteristic of Node.js is the programming language it supports: JavaScript. We will not go into the advantages and disadvantages of dynamic versus static languages here, and will just make three points:

As the main language of front-end engineers, JavaScript has considerable influence in the technical community. Moreover, with the continuous development of Web technology, and especially the growing importance of the front end, many front-end engineers have started to try their hand at 'back-end applications'. In many enterprises that use Node.js, engineers say they chose Node.js because they are used to JavaScript.

The anonymous function and closure features of JavaScript are very well suited to event-driven and asynchronous programming. The helloworld example shows that callbacks can be written as anonymous functions, which is very convenient. Closures are even more useful, as the following code example shows:

var http = require('http');
// requestOptions (host, port, path, and so on) is assumed to be defined elsewhere.
var hostRequest = http.request(requestOptions, function(response) {
  var responseHTML = '';
  // Each data event delivers one chunk of the response body.
  response.on('data', function (chunk) {
    responseHTML = responseHTML + chunk;
  });
  // By the time the end event fires, responseHTML holds the complete body.
  response.on('end', function() {
    console.log(responseHTML);
    // do something useful
  });
});
hostRequest.end(); // Finish and send the request.

In the above code, we need to process the responseHTML variable in the end event. Thanks to JavaScript closures, we can define responseHTML outside the two callback functions, keep modifying its value in the callback for the data event, and finally access and process it in the end event.

JavaScript performs well among dynamic languages. Some developers have analyzed the performance of dynamic languages such as JavaScript, Python, and Ruby and found that JavaScript outperforms the others. In addition, the V8 engine is an excellent engine of its kind, so Node.js benefits from its performance.

Brief history of Node.js development

In February 2009, Ryan Dahl announced on his blog that he was preparing to create a lightweight web server based on V8 and to provide a set of libraries.

In May 2009, Ryan Dahl released the initial version of some Node.js packages on GitHub, and in the following months people began to develop applications with Node.js.

In November 2009 and April 2010, both JSConf conferences scheduled talks on Node.js.

At the end of 2010, Node.js received funding from cloud computing service provider Joyent, and founder Ryan Dahl joined Joyent full-time to be responsible for the development of Node.js.

In July 2011, Node.js released a Windows version with the support of Microsoft.

Node.js application cases

Although Node.js was born just over two years ago, its momentum is gradually catching up with Ruby/Rails. Here we list some cases of enterprises using Node.js and hear what their engineers have to say.

In the latest mobile app released by the social networking site LinkedIn, Node.js is the back-end foundation of the application. Kiran Prasad, LinkedIn's mobile development director, told the media that the entire mobile software platform is built with Node.js:

LinkedIn internally uses a large number of technologies, but when it comes to mobile servers, we are completely based on Node.

(Reasons for using it) First, because of its flexibility. Second, if you understand Node, you will find that what it does best is communicate with other services. Mobile applications must interact with our platform API and database. We haven't done much data analysis. Compared with the Ruby on Rails technology used previously, the development team found that Node brought a large improvement in performance. They run 15 virtual servers (15 instances) on each physical machine, of which 4 instances can handle double the traffic. The capacity assessment is based on the results of load testing.

The enterprise social service website Yammer used Node to create cross-domain proxy servers for its own platform, and third-party developers can implement AJAX communication between their own domain-hosted JavaScript code and the Yammer platform API through this server. Jim Patterson, the technical director of the Yammer platform, expressed his own views on the advantages and disadvantages of Node:

(Advantages) Because Node is event-driven and non-blocking, it is very well suited to handling concurrent requests, so a proxy server built on Node performs much better than servers implemented with other technologies (such as Ruby). In addition, the client code that interacts with the Node proxy server is written in JavaScript, so the client and the server are written in the same language, which is wonderful.

(Disadvantages) Node is a relatively new open-source project, so it is not very stable. It is always changing, and it lacks sufficient third-party library support. It looks much like Ruby/Rails did in its early days.

The well-known project hosting website GitHub has also tried Node. Its Node application, called NodeLoad, is an archive download server (used every time you download a tarball or zip file of a stored branch). GitHub's previous archive download server was written in Ruby. In the old system, a request to download an archive created a Resque job. The job ran a git archive command on the archive server, fetching data from a file server. Meanwhile, the initial request was handed to a small Ruby Sinatra application that waited for the job; in practice it just checked whether a memcache flag existed and then redirected to the final download address. The old system ran about 3 Sinatra instances and 3 Resque workers. GitHub's developers saw this as a great opportunity for a Node application: Node is event-driven and, compared with Ruby's blocking model, can handle git archive better. While writing the new download server, the developers found Node very well suited to the job, and they also used the Node library socket.io to monitor download status.

Not only abroad, but Node's advantages have also attracted the attention of domestic developers. Taobao has actually applied Node technology:

MyFOX is a data processing middleware responsible for extracting data from a MySQL cluster, performing calculations, and outputting statistical results. Users submit a SQL statement, and MyFOX generates the query statements required for each database shard based on the semantics of the SQL command, sends them to each shard, and then summarizes and calculates the results. The characteristics of MyFOX are CPU-intensive, no file IO, and only handles read-only data. Initially, MyFOX was written in PHP, but encountered many problems. For example, PHP is single-threaded, and MySQL requires blocking queries, making it difficult to concurrently request data. The later solution was to use nginx and dirzzle, and implement the interface based on the HTTP protocol, and use curl_multi_get to perform requests. However, the MyFOX project team ultimately decided to implement MyFOX using Node.js.

There were many reasons for choosing Node.js, including the team's interest and the state of the community, as well as the hope of improving concurrency and making full use of the CPU. For example, frequently opening and closing connections leaves a large number of ports in a waiting state. When the number of concurrent connections increases, connections often fail because too few ports are available (many are in the TIME_WAIT state). In the past, system settings were often modified to shorten the waiting time and work around this error, but a connection pool solves the problem effectively. In addition, MyFOX would come under very dense access pressure when certain cache entries expired. With Node.js, query state can be shared, so that some requests "wait for a moment" while the system replenishes the cache.
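The idea of sharing query state so that requests "wait for a moment" can be sketched as follows. This is not MyFOX's actual implementation, which the article does not show; the names getOrLoad and slowQuery, the in-memory Map cache, and the 10 ms delay are all assumptions made for illustration. Because Node.js runs callbacks on a single thread, the shared maps need no locking.

```javascript
const cache = new Map();   // key -> cached value
const pending = new Map(); // key -> callbacks waiting for an in-flight query

// Hypothetical helper: run loadFn at most once per key while a query is in flight.
function getOrLoad(key, loadFn, callback) {
  if (cache.has(key)) {
    // Serve from the cache, still asynchronously so callers behave consistently.
    process.nextTick(callback, null, cache.get(key));
    return;
  }
  if (pending.has(key)) {
    // A query for this key is already running: just wait for its result.
    pending.get(key).push(callback);
    return;
  }
  pending.set(key, [callback]);
  loadFn(key, (err, value) => {
    const waiters = pending.get(key);
    pending.delete(key);
    if (!err) cache.set(key, value);
    waiters.forEach((cb) => cb(err, value));
  });
}

// Hypothetical slow back-end query that counts how often it is called.
let queries = 0;
function slowQuery(key, done) {
  queries += 1;
  setTimeout(() => done(null, 'result for ' + key), 10);
}

// Three concurrent requests for the same key trigger only one query.
for (let i = 0; i < 3; i++) {
  getOrLoad('report', slowQuery, (err, value) => {
    console.log('request', i, value, 'queries:', queries);
  });
}
```

All three callbacks receive the same result from a single call to slowQuery, and later requests for the same key are answered from the cache.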

Summary

This article briefly introduces the basic knowledge of Node.js, including its concepts, characteristics, history, and application cases. For a new platform that has been around for only about two years, the momentum of Node.js is evident. More and more enterprises are beginning to pay attention to Node.js and try it out, and both front-end and back-end developers should become familiar with it.
