HHVM is not for WordPress

So, I stopped using HHVM and fell back to ye-olde php5-fpm because HHVM got too good at what it is doing.

HHVM itself is beautiful and I like it for its administration simplicity and the throughput benefits it offers. The JIT performs better and better with each new release. This is absolutely brilliant if you’re writing new, clean, strictly typed and simple PHP code. Some of the engineering tradeoffs made to achieve this, mostly the translation cache size limits, mean that badly written, as in traditional, PHP code will eventually fill up the cache and HHVM will exit with an assertion.

So far this sounds good and sensible, so what’s the issue? Even WordPress is getting cleaner and better written with each release, and performs better and better on HHVM. Plugins are not.

Running WordPress without also running arbitrary plugins is not a realistic deployment scenario. And the worst sources of the kind of type polymorphism that hurts the HHVM JIT seem to be the caching plugins and, in general, anything that inserts itself deep and early into the request-response context, such as application firewall plugins.

In my experience, out of the box on a normal WordPress installation, a newer HHVM will currently assert 3..5 times per day. The older versions at first did not assert at all, then they started going down at a rate of a few times a week, and now we’re already up to a few times a day.

How about just raising the limits? I tried bumping the limits up to the next power of two every time I hit an assertion. This carries a startup time penalty and, quite understandably, makes HHVM use more memory at runtime. I also found out that at very high limits, after an assert, HHVM will not start again until you nuke the hhvm.hhbc SQLite repository. I quite agree with the documentation on the “extremely cryptic errors” part.
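For reference, the limits in question are the translation cache sizes set in server.ini. A sketch of what the bumping looks like – the option names below are my recollection of the HHVM 3.x era (the Eval.JitASize family) and are an assumption, so verify them against your release’s documentation:

```ini
; Hypothetical server.ini fragment: translation cache sizes bumped to the
; next power of two. Option names assumed from HHVM 3.x; sizes are in bytes.
hhvm.jit_a_size = 134217728
hhvm.jit_a_cold_size = 67108864
hhvm.jit_a_frozen_size = 134217728
hhvm.jit_global_data_size = 67108864
```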

What about the ugly way around it, simply kicking the process back up every time it dies? That can work for you. During the startup periods clients will receive 502s. This is a service level design consideration.
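If you go that route on a systemd machine, a drop-in is the usual shape for it – a sketch, assuming the unit is named hhvm.service:

```ini
# Hypothetical /etc/systemd/system/hhvm.service.d/restart.conf
# Restart HHVM whenever it exits, e.g. after a translation cache assert.
[Service]
Restart=always
RestartSec=2
```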

I’ve rather opted to go back to a more traditional way. PHP 7 will land soon enough and bring with it comparable performance gains.

Debian Jessie Nginx HTTP/2

Nginx 1.9.10 just landed in jessie-backports. How do I capitalize on that?

In order to benefit from HTTP/2 right now, you need to be willing to enable backports in your apt sources, and you also need to be serving traffic over TLS.

Also be aware that there are install images with firmware already on the disk these days, if you need that. And you can select your desktop environment from a menu instead of via kernel boot parameters too – Debian is getting quite convenient these days.

I thoroughly recommend the most excellent httpredir.debian.org service for your mirrors – it is available as a menu-selectable option under all mirror countries in debian-installer these days, so you no longer have to know about it and input it manually.

# Debian Jessie main repositories
deb http://httpredir.debian.org/debian jessie main contrib non-free
deb-src http://httpredir.debian.org/debian jessie main contrib non-free

# Debian Jessie updates
deb http://httpredir.debian.org/debian/ jessie-updates main contrib non-free
deb-src http://httpredir.debian.org/debian/ jessie-updates main contrib non-free

# Debian Jessie backports
deb http://httpredir.debian.org/debian/ jessie-backports main contrib non-free
deb-src http://httpredir.debian.org/debian/ jessie-backports main contrib non-free

# Debian Jessie security repositories
deb http://security.debian.org/ jessie/updates main contrib non-free
deb-src http://security.debian.org/ jessie/updates main contrib non-free

apt-get update
apt-get -t jessie-backports install nginx-full

Do note that the nginx-light package does not provide the HTTP/2 module, whereas nginx-full and nginx-extras do!

So, how to TLS? https://letsencrypt.org/

I’m a slight traditionalist in how I admin systems, so I’ve shied away from their official full-stack monolith automation client in favour of simp_le (developed by one of the key people behind the main project).

git clone https://github.com/zenhack/simp_le.git
cd simp_le

I like to use the webroot challenge-response domain validation method they offer. For this I have /var/www/letsencrypt for simp_le to write the challenge responses to, and a reusable, includable configuration snippet for Nginx that allows the ACME servers to verify the successful completion of the challenge-response authentication.

location ^~ /.well-known {
 allow all;
 alias /var/www/letsencrypt/.well-known;
}

I was not initially certain why the ^~ modifier is needed to explicitly allow access to /.well-known within otherwise denied locations. The answer, as far as I understand it, is that ^~ is not a regex match but a prefix match which, when it is the longest matching prefix, stops Nginx from evaluating regex locations at all – so a broad regex deny rule elsewhere cannot shadow the ACME path. This was the only way I found to make it work universally (Nginx versions 1.4 to 1.9 at the time of writing). Corrections welcome in the comments if you have a deeper understanding of Nginx here.
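A minimal pair illustrates the mechanics – this is my reading of Nginx location matching, not configuration from the setup above:

```nginx
# A broad regex rule like this would otherwise also deny /.well-known/...:
location ~ /\. {
 deny all;
}

# ...but ^~ is a prefix match that, when it is the longest matching prefix,
# stops regex location evaluation entirely, so the ACME path stays reachable:
location ^~ /.well-known {
 allow all;
 alias /var/www/letsencrypt/.well-known;
}
```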

server {
  listen 80 default_server;
  listen [::]:80 default_server;

  server_name _;
  include letsencrypt-webroot.conf;

  location / {
    return 444;
  }
}

You will have to prove control of your domain to Let’s Encrypt, and they only provide DV certs.

If you need OV or EV certs, you still need to pony up some cash, as those require human effort on the part of the CA.

include cors-map.conf;

server {
	listen 80;
	listen [::]:80;

	server_name example.org www.example.org static.example.org;

	access_log /var/log/nginx/example-access.log;
	error_log /var/log/nginx/example-error.log;

	include cors-headers.conf;

	return 301 https://$server_name$request_uri;
}

server {
	listen   443 ssl http2;
	listen   [::]:443 ssl http2;

	server_name example.org www.example.org static.example.org;

	access_log /var/log/nginx/example-access.log;
	error_log /var/log/nginx/example-error.log;

	include ssl-example.conf;

	root /var/www/example.org;
	index index.html index.htm;

	charset utf-8;

	include letsencrypt-webroot.conf;

	include cors-headers.conf;

	location / {
		try_files $uri $uri/ /index.html;
	}
}

This diverges into a neat split of shared configuration snippets, included breadcrumb-trail style, to keep each piece neat, human-readable and easy to edit when necessary.

The HTTP/2 bit here is really as simple as just slapping http2 after ssl on the listen directives.

I also do CORS handling here in the same fashion. Let’s get that out of the way first.

map $http_origin $allow_origin {
  hostnames;
  default '';
  include cors.map;
}

Here I define a map which by default sets an empty string as the value of the variable $allow_origin, but on a hostname match against $http_origin returns the matching origin instead.

Do note the map needs to be defined outside a server block. This is why the include for this sits at the beginning of the file.

example.org $http_origin;
*.example.org $http_origin;

Do take note how neat the map hostnames directive makes this list of allowed domains vs. some self-baked regex you might think of cooking up. It is also battle-tested, performant and robust in comparison.

add_header Access-Control-Allow-Origin $allow_origin;
add_header Access-Control-Allow-Credentials 'true';

And finally we just slap the headers onto responses where appropriate. $allow_origin gets resolved per request to either an empty string or a whitelisted domain match. The map mechanism is something more people should be using instead of the infamous if statement – Nginx has two separate resources pointing this out in its excellent documentation.

ssl_certificate /etc/letsencrypt/live/example.org/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/example.org/privkey.pem;
ssl_trusted_certificate /etc/letsencrypt/live/example.org/chain.pem;
ssl_dhparam /etc/letsencrypt/live/example.org/dhparam.pem;

include ssl-common.conf;

So, finally, Nginx SSL configuration.

My simp_le automation adheres to the output paths of the official Let’s Encrypt client. You may implement this yourself however you please – all roads lead to Rome as long as you keep your private key private!

Do note how neat and readable this per host includable certificate configuration is. And all the TLS settings for all of your hosts will be in a single clean shared file!

ssl_session_cache shared:SSL:10m;
ssl_session_timeout 1d;
ssl_session_tickets off;
ssl_stapling on;
ssl_stapling_verify on;

ssl_protocols TLSv1.1 TLSv1.2;
ssl_prefer_server_ciphers on;

# HSTS disabled due to it being able to paint oneself into a corner; an A is good enough from ssllabs.com
#add_header Strict-Transport-Security "max-age=15768000; includeSubdomains; preload";
add_header X-Frame-Options DENY;
add_header X-Content-Type-Options nosniff;

resolver 8.8.8.8 8.8.4.4 valid=300s;
resolver_timeout 5s;

I’m personally OK using the Google DNS here. Your organization might not be – do apply common sense here.
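If Google DNS is off the table, the only change needed is pointing the resolver directive at your own recursor – a sketch, assuming a caching resolver on localhost:

```nginx
# Hypothetical alternative: a local caching resolver instead of Google DNS.
resolver 127.0.0.1 valid=300s;
resolver_timeout 5s;
```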

When adjusting and/or balancing security vs. reachability, I usually refer to the Mozilla SSL config generator and cipherli.st.

It also does not hurt to take a peek at ssllabs.com, but playing golf with their scores always comes with a cost (performance or reachability). I aim for ‘minimum to attain A’ as I’ve found that the sweet spot balance for my tastes. Do check a few times per year as the world keeps on changing here.

Do note that Android does not really support modern TLS very far into the past, and old versions are still widely in use out there. So if you’re known to have many end users from smaller international markets, or just want to be extra inclusive, you will have to make an informed compromise between security and reachability.
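The concrete knob for that compromise is the ssl_protocols line in ssl-common.conf – re-enabling TLSv1 buys you old clients at a known security cost:

```nginx
# The more inclusive variant: TLSv1 re-enabled for legacy (mostly old Android) clients.
ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
```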

On a final note on Nginx SSL settings: HPKP and HSTS seem to be all the rage to enable these days just because you’re caught up in the game of ‘maximizing your security’. Both are engineering tools with rules, assumptions and tradeoffs built in. Do not wield them to please an arbitrary internet genie without understanding exactly how you can paint yourself into a corner with these two.

openssl dhparam -out dhparam.pem 4096

Don’t forget to try your patience generating yourself fresh Diffie-Hellman parameters too!

2048 is quite fine too for now (also as a key size!) when you run out of patience. The mathematics for attacking these key sizes is advancing scarily fast, though.

/path/to/simp_le/venv/bin/simp_le \
--email admin@example.org \
--default_root /var/www/letsencrypt \
-f account_key.json \
-f fullchain.pem \
-f cert.pem \
-f chain.pem \
-f key.pem \
-d example.org \
-d www.example.org \
-d static.example.org \
-d mail.example.org

The simp_le client can be run as an unprivileged user, as long as that user can write to wherever you put your account data and certificate data, and, for the authentication, also to /var/www/letsencrypt.

4096-bit keys are the default, and this tool will also rotate your private keys for you. This is one reason why they only issue certs with at most 90 days of validity.
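For the automation itself, a cron entry along these lines is one way to do it – a sketch using the paths from above. My recollection is that simp_le exits 0 only when certificates were actually (re)issued, which is what makes the conditional reload work, but verify the exit code semantics against simp_le’s own documentation:

```
# Hypothetical daily renewal attempt; reload Nginx only when certs changed.
0 4 * * * cd /etc/letsencrypt/live/example.org && /path/to/simp_le/venv/bin/simp_le -f account_key.json -f fullchain.pem -f chain.pem -f key.pem -d example.org -d www.example.org && systemctl reload nginx
```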

So yay, now everything works for everyone? Sorta.

ALPN would require OpenSSL 1.0.2, and jessie / jessie-backports only have 1.0.1 as of writing. Stretch already has 1.0.2, so it’s a matter of time until jessie-backports also has a suitable OpenSSL version available.

NPN still works for negotiating the protocol, and the cost of an extra round of negotiation is not too tragic, since ALPN will start working without configuration changes via a simple drop-in package upgrade from backports later on.

So what does one actually gain from all this?

What’s next for HTTPS via Nginx? Hopefully Brotli.

HHVM 3.9.0 on CentOS 7.1

So, I started a blog.

This is hosted on HHVM. The setup turned into a minor adventure. Business as usual on the bleeding edge.

Starting notes

  • HHVM does not have an official yum repo
  • The .rpm is still quite a work in progress
    • They are working on it, or at least their TODO is extensive
      • What outright failed out of the box for me:
        • It did not create /var/lib/hhvm/
        • It did not create /var/log/hhvm/
        • It did not create /etc/tmpfiles.d/hhvm.conf
        • It did not create the group hhvm
        • It did not create the user hhvm
      • What could be better out of the box for me:
        • It only listens on TCP per default

Fixing from the bottom up

I simply created the user and the group the way the package is supposed to. Notice how it uses /var/lib/hhvm as the home directory for the user. This turned out to be a very good hint later on.

groupadd -r hhvm
useradd -r -g hhvm -d /var/lib/hhvm -s /sbin/nologin -c "HHVM" hhvm

Unix sockets do have a reputation of being faster for within-system IPC, so tackling that came next.

pid = /run/hhvm/hhvm.pid

hhvm.server.file_socket = /run/hhvm/hhvm.sock
hhvm.server.type = fastcgi
hhvm.server.default_document = index.php
hhvm.source_root = /var/www

hhvm.log.use_log_file = true
hhvm.log.file = /var/log/hhvm/error.log

At this point the daemon still did not start, as it could not write to its socket. A bit of further housekeeping was required:

  • The log directory needs to exist and be writable
    • This one is not a fatal problem, but should be addressed if logging is desired
  • The run directory needs to exist and be writable
  • The run directory needs to persist across reboots
    • Option 1 – modify the systemd hhvm.service file
      • You have to upkeep it vs. future upstream changes in hhvm.service
      • ExecStartPre hackery is usually considered ugly these days
    • Option 2 – use /etc/tmpfiles.d/
      • This is a freedesktop.org standard for solving the problem scope
      • Check man tmpfiles.d if in doubt about the syntax
[Unit]
Description=HipHop Virtual Machine (FCGI)

[Service]
ExecStartPre=/bin/mkdir -p /run/hhvm
ExecStartPre=/bin/chown hhvm:hhvm /run/hhvm
ExecStart=/usr/bin/hhvm -c /etc/hhvm/server.ini -c /etc/hhvm/php.ini --user=hhvm --mode daemon

[Install]
WantedBy=multi-user.target

# /etc/tmpfiles.d/hhvm.conf
d /run/hhvm 0755 hhvm hhvm -

Now the daemon started, but my webfront could not read its socket. This is where it would have been nice to be able to set the socket user and group, but alas, HHVM does not yet support that. There was already work towards it, but a large changeset reworking the whole FastCGI implementation was (quite understandably) given priority for being pulled in, and no one ever got around to rebasing the configuration work.

I quite inelegantly solved this for now by simply letting HHVM run as the webfront user.
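For completeness, the webfront side of that socket looks roughly like this – an assumed Nginx fragment, not configuration from this setup:

```nginx
# Hypothetical Nginx side of the unix socket configured above.
location ~ \.php$ {
	fastcgi_pass unix:/run/hhvm/hhvm.sock;
	fastcgi_index index.php;
	fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
	include fastcgi_params;
}
```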

Also, HHVM caches code to disk, so after tinkering around for a while I had multiple .hhvm.hhbc files littered around my system and was scratching my head a bit about why. It defaults to ~/.hhvm.hhbc. This is where the hint I mentioned earlier came in handy in figuring out what was going on.

Initial impressions

  • It likes to use a lot of memory, but it does so in a single process
    • Quite nicely easy to track at a glance vs. older worker process models
    • Out of the box it uses about as much memory as ~10 workers would
      • I would not use so many workers for this deployment myself
      • It’ll be quite interesting to see how this lives over time and different loads
  • There is no way to meaningfully limit how much memory it can use
    • This blog will most likely outright crash my server in case of any serious traffic
    • The ugly hack would be to use systemd resource controls
      • Restart the service when a certain memory limit is hit
    • The real solution would be to have an infrastructure where you have dedicated HHVM boxes
  • It pretty much does what it says on the tin
    • It already supports most of what one would imagine to do with it
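The ugly hack mentioned above would look roughly like this as a systemd drop-in – a sketch; MemoryLimit is the cgroup memory cap directive of this systemd era, the unit name and the 2G figure are assumptions:

```ini
# Hypothetical /etc/systemd/system/hhvm.service.d/memory.conf
# Let the kernel kill HHVM past the cap, then have systemd bring it back up.
[Service]
MemoryLimit=2G
Restart=always
```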