Comments (27)
Hi @kazuho
What impresses me is your spirit: you keep developing h2o even though I challenged your algorithm...
Let me introduce FastSocket https://github.com/fastos/fastsocket, a kernel-space networking improvement. Happy hacking, and just for fun ;)
from h2o.
So @kazuho failed to explain WHY h2o is 2x faster than Nginx https://github.com/kazuho/h2o/issues/14
@wangbin579 Thank you for the suggestion.
I have updated the benchmark numbers to reflect the performance of the newest version of H2O, and of nginx with tcp_nopush set.
BTW, we found that nginx outperforms h2o significantly when using webbench.
Thank you for running the benchmark. I have noticed that H2O was not setting TCP_NODELAY and have updated the code just a couple of minutes ago in commit eef1612.
It would be great if you could run the benchmark again and check whether the issue persists. If it does, please provide the parameters of the benchmark you are running.
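For reference, here is a minimal sketch (not h2o's actual code) of how a server typically disables Nagle's algorithm with the TCP_NODELAY socket option, so that small responses are sent without waiting for more output; `enable_tcp_nodelay` is a hypothetical helper name.

```c
/* Sketch only (not h2o's implementation): disable Nagle's algorithm on a
 * TCP socket so small writes are flushed to the wire immediately. */
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 0 on success, -1 on failure (same convention as setsockopt). */
static int enable_tcp_nodelay(int fd)
{
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one));
}
```

Forgetting this option can add milliseconds of latency per small response, which is enough to distort a localhost benchmark.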
@kazuho I hold the view that if you developed a new I/O event notification mechanism in kernel space that performs better than epoll, then h2o might be faster than Nginx on a Linux box. But I/O event optimization alone is not enough, as you will see if you have a look at the source code of Nginx's modules ;)
@xiangzhai Did you ever take a look at the link I provided in https://github.com/kazuho/h2o/issues/14#issuecomment-56323752 ?
Nginx:
# ./nginx -v
nginx version: nginx/1.6.2
worker_processes 1;
master_process off;
daemon off;
error_log /dev/stderr warn;
events {
    worker_connections 1024;
}
http {
    default_type application/octet-stream;
    sendfile on;
    tcp_nopush on;
    keepalive_timeout 65;
    keepalive_requests 1000000;
    access_log off;
    server {
        listen 8070;
        location / {
            root html;
        }
    }
}
[html]# cat index.html
It works!
[webbench-1.5]# ./webbench -c 10 -t 60 http://xxx.xxx.xxx.xxx:8070/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.xxx:8070/index.html
10 clients, running 60 sec.
Speed=960826 pages/min, 3827290 bytes/sec.
Requests: 960826 susceed, 0 failed.
H2O:
port: 8090
files:
  /: examples/doc_root
request-timeout: 10
mime-types:
  txt: text/plain
  html: text/html
  gif: image/gif
  png: image/png
  jpg: image/jpeg
  jpeg: image/jpeg
  css: text/css
  js: application/javascript
[doc_root]# cat index.html
It works!
[webbench-1.5]# ./webbench -c 10 -t 60 http://xxx.xxx.xxx.xxx:8090/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.xxx:8090/index.html
10 clients, running 60 sec.
Speed=752419 pages/min, 2671087 bytes/sec.
Requests: 752419 susceed, 0 failed.
@kazuho sorry, I missed your reply ;) https://news.ycombinator.com/item?id=8342684
Good, it is the correct way to compare the efficiency of different utilities, I will read your HTTP parser source code https://github.com/kazuho/picohttpparser/blob/master/picohttpparser.c then compare with Nginx to check out the time complexity O(n), indeed it is fair play ;)
@wangbin579 Thank you for the log. It is likely that the concurrency you use for the benchmark (= 10) is too small. It depends on the performance of the benchmark PC, but I expect both nginx and h2o to perform faster than (960826 pages/min = 16014 pages/sec) if benchmarked correctly.
Would you mind changing -c to 100 or so and trying again?
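As a back-of-the-envelope check (Little's law: throughput ≈ concurrency / per-request latency), 10 clients producing 16,014 pages/sec implies roughly 0.6 ms per request, so the client concurrency rather than the server may be the bottleneck. A hedged sketch; `implied_latency_ms` is a hypothetical helper, and the latency is derived from the numbers above, not measured:

```c
/* Little's law check: with C concurrent clients sustaining R requests/sec,
 * the implied per-request latency is C / R. If that latency is tiny, the
 * throughput ceiling is set by the client, not the server. */
#include <assert.h>

static double implied_latency_ms(double requests_per_sec, double concurrency)
{
    return concurrency / requests_per_sec * 1000.0;
}
```

With the webbench numbers above, implied_latency_ms(16014.0, 10.0) is about 0.62 ms; raising -c raises the achievable request rate until the server saturates.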
nginx:
[webbench-1.5]# ./webbench -c 500 -t 60 http://xxx.xxx.xxx.xxx:8070/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.xxx:8070/index.html
500 clients, running 60 sec.
Speed=1084465 pages/min, 4319785 bytes/sec.
Requests: 1084465 susceed, 0 failed.
h2o:
[webbench-1.5]# ./webbench -c 500 -t 60 http://xxx.xxx.xxx.xxx:8090/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.xxx:8090/index.html
500 clients, running 60 sec.
Speed=958200 pages/min, 3401716 bytes/sec.
Requests: 958200 susceed, 0 failed.
We also noticed that nginx consumes more CPU resources (vmstat's us column) than h2o.
vmstat:
h2o:
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 28764 162556 508400 2562564 0 0 0 8 2464 3743 4 40 56 0 0
2 0 28764 162680 508400 2562564 0 0 0 0 2446 3931 3 40 57 0 0
nginx:
r b swpd free buff cache si so bi bo in cs us sy id wa st
2 0 28764 143944 508400 2562564 0 0 0 0 2418 188 6 44 50 0 0
2 0 28764 143944 508400 2562564 0 0 0 0 2333 170 6 44 50 0 0
[webbench-1.5]# uname -a
Linux zw1-12-193.inter.163.com 2.6.32-431.5.1.el6.x86_64 #1 SMP Fri Jan 10 14:46:43 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
[webbench-1.5]# grep "model\ name" /proc/cpuinfo
model name : Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
model name : Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
model name : Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
model name : Intel(R) Xeon(R) CPU 5130 @ 2.00GHz
[webbench-1.5]# free -m
total used free shared buffers cached
Mem: 7871 1413 6458 0 166 151
-/+ buffers/cache: 1096 6775
Swap: 8191 43 8148
We also noticed that nginx consumes more CPU resources (vmstat's us column) than h2o.
That means it is highly likely that your benchmark is failing to put enough load on the httpd. For a correct benchmark, both nginx and H2O should use the same CPU resources (since the benchmark runs the servers on a single core, they should ideally consume 100% of that core).
There are a couple of ways to achieve this goal.
One way is to pin webbench to a single core (it is a preforking benchmark tool, which means that unless it is pinned, it will compete for CPU resources with the server being benchmarked).
But I would suggest switching to another benchmark tool, since using a preforking client is likely too heavy for benchmarking event-driven HTTP servers like nginx or H2O.
PS. I have tried running taskset 1 webbench -c 100 -t 15 on my testbed (taskset is the Linux command used to pin processes to a set of CPU cores), but with such a configuration neither nginx nor H2O utilized 100% of the CPU core. That means you should assign more than two CPU cores to webbench when benchmarking nginx or H2O running on a single CPU core.
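For reference, what taskset does under the hood can be sketched with Linux's sched_setaffinity. This is a minimal, Linux-specific sketch; `pin_to_cpu0` is a hypothetical helper:

```c
/* Pin the calling process to CPU 0, the same effect as `taskset 1 <cmd>`.
 * Linux-specific; error handling kept minimal for illustration. */
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

static int pin_to_cpu0(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);
    /* pid 0 means "the calling process" */
    return sched_setaffinity(0, sizeof(set), &set);
}
```

Pinning the load generator this way keeps it from stealing cycles from the single-core server under test.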
PS2. I had to assign three CPU cores to webbench so that the HTTP servers would not sit idle. The results for serving 6-byte content on my testbed were as follows. I hope you can reproduce similar results.
# H2O
$ taskset 7 ./webbench -c 100 -t 15 http://127.0.0.1:7890/
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://127.0.0.1:7890/
100 clients, running 15 sec.
Speed=1820880 pages/min, 6312384 bytes/sec.
Requests: 455220 susceed, 0 failed.
# nginx
$ taskset 7 ./webbench -c 100 -t 15 http://127.0.0.1:8080/
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://127.0.0.1:8080/
100 clients, running 15 sec.
Speed=1300524 pages/min, 5072043 bytes/sec.
Requests: 325131 susceed, 0 failed.
@kazuho I checked out your phr_parse_request https://github.com/kazuho/picohttpparser/blob/master/picohttpparser.c#L246 used by h2o; phr_parse_request's time complexity is shown below:
- is_complete: O(n), to check whether or not buf_start == buf_end
- ADVANCE_TOKEN: O(n); is it really better than strtok? It is called TWICE by parse_request
- parse_int: O(n); is it really better than Python's string-to-int parsing algorithm in C?
- parse_headers: O(n^2)
- get_token_to_eol: O(n^2), called by parse_headers
I am NOT impressed; it is just so-so, unless you had developed it in assembly.
Sorry, I did not mention that webbench runs on a different machine.
When running webbench over the lo interface, h2o outperforms nginx slightly, but this is not the common usage scenario.
nginx:
[190 webbench-1.5]$ ./webbench -c 500 -t 60 http://xxx.xxx.xxx.190:8070/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.190:8070/index.html
500 clients, running 60 sec.
Speed=888454 pages/min, 3539008 bytes/sec.
Requests: 888454 susceed, 0 failed.
h2o:
[190 webbench-1.5]$ ./webbench -c 500 -t 60 http://xxx.xxx.xxx.190:8090/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.190:8090/index.html
500 clients, running 60 sec.
Speed=986666 pages/min, 3501936 bytes/sec.
Requests: 986461 susceed, 205 failed.
If we use taskset, the results are as follows:
nginx:
[190 webbench-1.5]$ taskset 7 ./webbench -c 500 -t 60 http://xxx.xxx.xxx.190:8070/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.190:8070/index.html
500 clients, running 60 sec.
Speed=998937 pages/min, 3979099 bytes/sec.
Requests: 998937 susceed, 0 failed.
h2o:
[190 webbench-1.5]$ taskset 7 ./webbench -c 500 -t 60 http://xxx.xxx.xxx.190:8090/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.190:8090/index.html
500 clients, running 60 sec.
Speed=1008222 pages/min, 3578822 bytes/sec.
Requests: 1008120 susceed, 102 failed.
Note that there are failed requests here.
If webbench is run on a different machine (not a virtual machine), no matter how you optimize, the results are similar to the following:
nginx:
[191 webbench-1.5]# taskset 7 ./webbench -c 500 -t 60 http://xxx.xxx.xxx.190:8070/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.190:8070/index.html
500 clients, running 60 sec.
Speed=1045694 pages/min, 4165339 bytes/sec.
Requests: 1045694 susceed, 0 failed.
h2o:
[191 webbench-1.5]# taskset 7 ./webbench -c 500 -t 60 http://xxx.xxx.xxx.190:8090/index.html
Webbench - Simple Web Benchmark 1.5
Copyright (c) Radim Kolar 1997-2004, GPL Open Source Software.
Benchmarking: GET http://xxx.xxx.xxx.190:8090/index.html
500 clients, running 60 sec.
Speed=895681 pages/min, 3179621 bytes/sec.
Requests: 895668 susceed, 13 failed.
When using taskset, both h2o and nginx consume almost 100% of the CPU (%cpu).
It should be noted that webbench uses short-lived connections while wrk uses long-lived connections.
When using wrk to stress nginx and h2o, we get the following results:
[190 html]# ll hessian-serialization.html
-rw-r--r-- 1 root root 68814 Mar 24 2014 hessian-serialization.html
h2o:
[191 wrk]# ./wrk -c 100 -d 15 -t 1 http://xxx.xxx.xxx.190:8090/hessian-serialization.html
Running 15s test @ http://xxx.xxx.xxx.190:8090/hessian-serialization.html
1 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 58.28ms 15.82ms 85.58ms 63.70%
Req/Sec 1.72k 61.48 1.87k 66.00%
25701 requests in 15.00s, 1.66GB read
Requests/sec: 1713.38
Transfer/sec: 113.03MB
nginx:
[191 wrk]# ./wrk -c 100 -d 15 -t 1 http://xxx.xxx.xxx.190:8070/hessian-serialization.html
Running 15s test @ http://xxx.xxx.xxx.190:8070/hessian-serialization.html
1 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 58.15ms 15.61ms 85.30ms 63.85%
Req/Sec 1.72k 41.09 1.81k 65.00%
25696 requests in 15.00s, 1.66GB read
Requests/sec: 1713.00
Transfer/sec: 113.03MB
[190 html]# ll index.html
-rw-r--r-- 1 root root 10 Sep 22 14:16 index.html
h2o:
[191 wrk]# ./wrk -c 100 -d 15 -t 1 http://xxx.xxx.xxx.190:8090/index.html
Running 15s test @ http://xxx.xxx.xxx.190:8090/index.html
1 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 2.81ms 88.15us 5.13ms 71.17%
Req/Sec 37.67k 1.96k 45.56k 56.11%
528777 requests in 15.00s, 109.93MB read
Requests/sec: 35252.31
Transfer/sec: 7.33MB
nginx:
[191 wrk]# ./wrk -c 100 -d 15 -t 1 http://xxx.xxx.xxx.190:8070/index.html
Running 15s test @ http://xxx.xxx.xxx.190:8070/index.html
1 threads and 100 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 3.33ms 1.80ms 7.28ms 58.30%
Req/Sec 31.67k 1.69k 40.56k 59.42%
445899 requests in 15.00s, 103.76MB read
Requests/sec: 29727.10
Transfer/sec: 6.92MB
BTW, in my opinion benchmarks sometimes prove nothing to engineers, and a production-workload-based comparison is more useful.
It should be noted that webbench uses short-lived connections while wrk uses long-lived connections.
When using wrk to stress nginx and h2o, we get the following results:
Ah! Thank you for mentioning that!
I hadn't noticed that webbench was not using persistent connections. H2O's way of calling accept(2) is very primitive (one call to accept(2) per event-loop iteration), and that explains why it does not perform as well as nginx when benchmarked with webbench.
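A common alternative to one accept(2) per event-loop iteration is to drain the listen backlog until the call would block. The sketch below illustrates that pattern; it is not h2o's actual code, `drain_accept` is a hypothetical helper, and it assumes a non-blocking listening socket:

```c
/* Sketch (not h2o's implementation): accept every pending connection in one
 * event-loop iteration instead of a single one. With a non-blocking listening
 * socket, accept() returns -1 with errno EAGAIN/EWOULDBLOCK once the backlog
 * is empty. Returns the number of connections accepted (at most `max`). */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int drain_accept(int listen_fd, int *out_fds, int max)
{
    int n = 0;
    while (n < max) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd < 0)
            break; /* EAGAIN/EWOULDBLOCK: backlog drained (or a real error) */
        out_fds[n++] = fd;
    }
    return n;
}
```

Under a short-lived-connection load like webbench's, accepting only one connection per loop iteration caps the accept rate at the event-loop frequency, which is exactly the bottleneck described above.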
BTW, in my opinion benchmarks sometimes prove nothing to engineers, and a production-workload-based comparison is more useful.
I totally agree that a production-workload-based comparison is more useful, although I would like to point out that the types of HTTP requests that matter for an ordinary HTTP server in terms of performance are GET requests against small-to-medium-sized asset files over persistent connections, since those make up most of the requests sent to a typical web server.
Thank you anyway for your help in improving the benchmark and for analyzing the characteristics of the benchmark tools. I really appreciate it.
@xiangzhai Your analysis is totally incorrect.
I suggest once again that you actually perform dynamic analysis (e.g. profile the code) instead of trying to persuade somebody with your static analysis. The number of mistakes you have made in analyzing picohttpparser is a clear sign that your approach is failing. I cannot expect a correct analysis of nginx or H2O from the same approach that caused so many mistakes in analyzing picohttpparser (which is < 1,000 LOC).
I expect to see more productive comments from you in the future.
FYI, below are some of the mistakes found in your analysis.
phr_parse_request's time complexity is shown below:
1 is_complete O(n) to check whether or not buf_start == buf_end
Wrong. is_complete is never called in most cases, since HTTP requests typically arrive as a single packet. This would have been clear if you had applied dynamic analysis, or had considered the meaning of the comment right before the call to the function.
2 ADVANCE_TOKEN O(n) is it really better than strtok? called TWICE by parse_request
You should never ever apply a function that takes a C string as its argument to a chunk of data that is not NUL-terminated. This is one of the very basic things in secure C coding.
4 parse_headers O(n^2)
5 get_token_to_eol O(n^2) called by parse_headers
Wrong. They are O(N) under the assumption that N is the length of the request (generally speaking, you should clarify what N is before using the O() notation).
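The nested-loop confusion can be illustrated with a stripped-down model of the same loop structure: the inner 16-byte block loop advances the same cursor as the outer loop, so each byte is inspected exactly once. This is a sketch for illustration, not picohttpparser itself; `blocked_scan_visits` is a hypothetical function that counts byte inspections:

```c
/* Model of get_token_to_eol's loop shape: an outer loop that processes the
 * buffer in 16-byte blocks plus a remainder loop. Although the loops are
 * nested, both advance the same `buf` cursor, so the total number of byte
 * inspections equals the buffer length -- O(N), not O(N^2). */
#include <assert.h>
#include <stddef.h>

static size_t blocked_scan_visits(const char *buf, const char *buf_end)
{
    size_t visits = 0;
    while (buf != buf_end) {
        if (buf_end - buf >= 16) {
            int i;
            for (i = 0; i < 16; i++, ++buf)
                ++visits; /* *buf would be inspected here */
        } else {
            for (; buf != buf_end; ++buf)
                ++visits; /* remainder bytes, inspected once each */
        }
    }
    return visits;
}
```

Counting visits for any input length gives exactly N, which is the dynamic-analysis evidence that the nested structure is still linear.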
@kazuho please apply time-complexity-test.patch https://github.com/xiangzhai/picohttpparser/blob/master/time-complexity-test.patch to check the time elapsed when calling your HTTP parser on a request 10K times.
I hold the view that your HTTP parser is O(n^2), just like the VERY BASIC approach http://leetcode.com/2011/11/longest-palindromic-substring-part-i.html, NOT black-art O(n) http://leetcode.com/2011/11/longest-palindromic-substring-part-ii.html
I cannot expect a correct analysis of nginx or H2O from the same approach that caused so many mistakes in analyzing picohttpparser (< 1000 LOC)
YUP, it is the fastest algorithm if you developed NOTHING ;) https://github.com/xiangzhai/picohttpparser/blob/master/test_print.c
Perhaps h2o would be 2x faster than Nginx after you have worked through all of the leetcode questions http://leetcode.com/
please apply time-complexity-test.patch https://github.com/xiangzhai/picohttpparser/blob/master/time-complexity-test.patch to check the time elapsed when calling your HTTP parser on a request 10K times.
What are you trying to compare with the benchmark? If you think that picohttpparser is O(N^2) against the length of the HTTP input, then you should compare the times consumed for different lengths of inputs. Parsing a single request 10,240 times does not mean anything.
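The methodology described above can be sketched as follows: instrument the parser with an operation counter and compare the work done on two input sizes; a linear parser doubles its work when the input doubles, while a quadratic one would roughly quadruple it. The scanner below is a toy stand-in (`count_header_lines` is a hypothetical function), not picohttpparser:

```c
/* Empirical complexity check: count operations for inputs of two different
 * lengths and compare the ratio. A toy header-line scanner that inspects
 * each byte once, so the ratio should match the input-length ratio. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

static size_t count_header_lines(const char *buf, size_t len, size_t *ops)
{
    size_t lines = 0;
    *ops = 0;
    for (size_t i = 0; i < len; i++) {
        ++*ops; /* one inspection per byte */
        if (buf[i] == '\n')
            ++lines;
    }
    return lines;
}
```

Repeating the same fixed-size input 10,240 times, by contrast, only multiplies a constant and cannot distinguish O(N) from O(N^2).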
I hold the view that your HTTP parser is O(n^2) just like VERY BASIC one http://leetcode.com/2011/11/longest-palindromic-substring-part-i.html NOT black art O(n) http://leetcode.com/2011/11/longest-palindromic-substring-part-ii.html
Do you really think that the function below is O(N^2)? It does have a nested loop, but it is clearly O(N). If you cannot understand that, you should really go back to your computer science textbook.
static const char* get_token_to_eol(const char* buf, const char* buf_end,
                                    const char** token, size_t* token_len,
                                    int* ret)
{
    const char* token_start = buf;
    while (1) {
        if (likely(buf_end - buf >= 16)) {
            unsigned i;
            for (i = 0; i < 16; i++, ++buf) {
                if (unlikely((unsigned char)*buf <= '\015')
                    && (*buf == '\015' || *buf == '\012')) {
                    goto EOL_FOUND;
                }
            }
        } else {
            for (; ; ++buf) {
                CHECK_EOF();
                if (unlikely((unsigned char)*buf <= '\015')
                    && (*buf == '\015' || *buf == '\012')) {
                    goto EOL_FOUND;
                }
            }
        }
    }
EOL_FOUND:
    if (*buf == '\015') {
        ++buf;
        EXPECT_CHAR('\012');
        *token_len = buf - 2 - token_start;
    } else { /* should be: *buf == '\012' */
        *token_len = buf - token_start;
        ++buf;
    }
    *token = token_start;
    return buf;
}
YUP, it is the fastest algorithm if you developed NOTHING ;) https://github.com/xiangzhai/picohttpparser/blob/master/test_print.c
Thank you for your vandalism. In my last comment, I said that I expect you to provide more productive comments. Should I just give up and ignore you?
- I do not have time to write stress and boundary test cases for your copy/paste open source project.
- get_token_to_eol's time complexity is O(n); my mistake, sorry.
- Just ignore me ;) because I argue that your stuff is not better than Nginx.
1 I do not have time to write stress and boundary test cases for your copy/paste open source project.
Should I take your comment "copy/paste open source project" as a personal insult?
2 get_token_to_eol's time complexity is O(n); my mistake, sorry
I am glad that you have acknowledged one of the mistakes you made in the analysis. As previously stated, there are other mistakes as well; I recommend you revisit all of your analysis.
3 just ignore me ;) because I argue that your stuff is not better than Nginx
This place exists to discuss problems and/or improvements related to H2O (and improvements to the benchmark in the README fit that criteria). Your comments do not seem to fit in.
From now on, I will remove your comments without warning unless they are productive.
Nginx logs all requests but h2o does not; that is not a fair comparison.
@mingodad
No. As commented in https://github.com/kazuho/h2o/issues/14#issuecomment-57406533, both nginx and h2o are run with their access logs disabled. The configuration files can be found at the link in the README.
My slides at http://blog.kazuhooku.com/2014/11/the-internals-h2o-or-how-to-write-fast.html cover new benchmark results (including access from a remote host) and address some of the implementation differences between H2O and nginx (with the understanding that http-parser is a fork of nginx's HTTP parser).
Just came across H2O; interesting project. I might integrate it into my Centmin Mod LEMP web stack, as I have plans to add other web servers (Apache 2.4 and OpenLiteSpeed) to the existing Nginx web stack, and H2O seems simple enough to integrate too :)
For benchmarks, also check out these other tools:
- siege http://wordpress7.centminmod.com/122/wordpress-super-cache-benchmark-siege-http-load-testing/
- locust.io http://wordpress7.centminmod.com/132/wordpress-super-cache-benchmark-locust-io-load-testing/ - official site at http://locust.io/ and github https://github.com/locustio/locust
- blitz.io http://wordpress7.centminmod.com/74/wordpress-super-cache-benchmarks-blitz-io-load-test-237-million-hitsday/