phpDocumentor, an update

In the past month so much has happened! Unfortunately, none of it shows on GitHub yet. You might be thinking that those phpDocumentor people are slacking while there is a crapton of GitHub issues still waiting to be worked on, but the truth is that lots of stuff is going on right now.

As for me personally? I have been to two conferences and a tech event (where I coached), decided to decimate my growing mailbox (including some messages from people with phpDocumentor questions; again, my apologies for the belated answers), and have been teaching PHP to my wife. Recently a friend of ours asked for an introduction to development as well, to see whether it would be a good career move. In addition, I am helping two other awesome people raise their skills, and they contribute to phpDocumentor in return. Pure win!

You get it. I have been busy with all kinds of awesome. But I promised you that aside from all this there are excellent things happening with phpDocumentor as well. Here goes.

Documentation Standards

One of our oldest feature requests is #40. In that feature request, people ask for a way to configure how phpDocumentor validates your in-source documentation. Because this request has been open for so long, I decided to deliver it. I saw that merely adding a toggle to a configuration file might work, but I wanted to take it one step further: Documentation Standards.

Once this feature gets in, it will be possible to create a Documentation Standard for your project in which you configure which violations to check for. We will follow the same ruleset format that PHP_CodeSniffer uses, so that you do not have to get used to a new format, and we will deliver a strict default standard.

Search

In the early days of phpDocumentor version 2 we had a search facility, but we felt that we had to remove it because it performed poorly and caused annoying delays when reading the documentation. Issue #636 has been open ever since, and I have decided that enough is enough. This has to be fixed! So we are currently working on a search system where you can either use a client-side full-text search solution or switch out an adapter and push your search data into Elasticsearch, which should make searching much faster.
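To give an idea of what "switching out an adapter" could mean, here is a minimal sketch. It is written in Python for brevity and is not phpDocumentor code; the names SearchAdapter, ClientSideAdapter, BulkExportAdapter and build_search_data are made up for illustration, and the export format is an assumption rather than the format of any particular search server.

```python
import json
import re
from abc import ABC, abstractmethod
from collections import defaultdict


class SearchAdapter(ABC):
    """Hypothetical interface: receives documents and decides where they go."""

    @abstractmethod
    def index(self, doc_id: str, text: str) -> None: ...

    @abstractmethod
    def flush(self) -> str: ...


class ClientSideAdapter(SearchAdapter):
    """Builds an inverted index that a browser script could load as JSON."""

    def __init__(self) -> None:
        self._terms = defaultdict(set)

    def index(self, doc_id: str, text: str) -> None:
        # Map every lower-cased word to the documents that contain it.
        for term in re.findall(r"[a-z0-9_]+", text.lower()):
            self._terms[term].add(doc_id)

    def flush(self) -> str:
        return json.dumps({t: sorted(ids) for t, ids in sorted(self._terms.items())})


class BulkExportAdapter(SearchAdapter):
    """Collects raw documents as newline-delimited JSON for an external server."""

    def __init__(self) -> None:
        self._lines = []

    def index(self, doc_id: str, text: str) -> None:
        self._lines.append(json.dumps({"id": doc_id, "text": text}))

    def flush(self) -> str:
        return "\n".join(self._lines)


def build_search_data(adapter: SearchAdapter, docs: dict) -> str:
    """The generator only talks to the interface, so adapters are interchangeable."""
    for doc_id, text in docs.items():
        adapter.index(doc_id, text)
    return adapter.flush()
```

The point of the design is that the code producing the documentation never needs to know whether the search data ends up in a static file next to the HTML or in a search server.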

Performance

A few weeks ago I read a tweet by Ben Ramsey in which he talked about the performance of phpDocumentor and mentioned that the project he was working on took 2 hours to document. Now, I know that phpDocumentor is not the fastest of the next-generation generators (because it is the most flexible and feature-complete one), but this is absurd and completely unacceptable!

This tweet pushed me to give priority to a few initiatives that had been waiting on a shelf because I was not aware of the urgency of the situation. One of these is a platform that continuously measures the performance impact of changes and alerts us when it crosses a critical threshold. I am still working on this component, but we currently have XHGui and XHProf working together to gather statistics for us.

Two more changes are architectural in nature. One of them is to push the Reflection component to a version 2 and actually create Descriptor objects directly from the interpreted source files.

For those of you who don’t know: phpDocumentor parses a source file into an Abstract Syntax Tree (AST) using PHP-Parser, then transforms it into a series of objects that resemble PHP’s Reflection objects and finally transforms that into a series of light-weight objects representing the structures in the code called Descriptors. With this change we cut out the Reflectors and directly create Descriptors.
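As a rough illustration of "AST in, Descriptors out" without a reflection layer in between, here is a small sketch. It is Python, not PHP: Python's built-in ast module stands in for PHP-Parser, and ClassDescriptor, MethodDescriptor and describe are hypothetical names that do not correspond to phpDocumentor's real classes.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class MethodDescriptor:
    name: str
    summary: str


@dataclass
class ClassDescriptor:
    name: str
    summary: str
    methods: list = field(default_factory=list)


def _summary(node) -> str:
    """First line of the node's docstring, or an empty string if it has none."""
    doc = ast.get_docstring(node) or ""
    return doc.splitlines()[0] if doc else ""


def describe(source: str) -> list:
    """Walk the AST once and build descriptors directly from its nodes."""
    descriptors = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            cls = ClassDescriptor(node.name, _summary(node))
            for child in node.body:
                if isinstance(child, ast.FunctionDef):
                    cls.methods.append(MethodDescriptor(child.name, _summary(child)))
            descriptors.append(cls)
    return descriptors
```

Every intermediate object model that is skipped means one pass less over the whole code base and fewer objects in memory, which is where the hoped-for gain comes from.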

The other architectural change is that I have been writing a Lexer for the PHPDoc language in an attempt to improve the parsing efficiency of DocBlocks so that we can win some more speed in that department as well. To be honest, I do not know yet if and how much speed I can win there but this is also a proof of concept where we can more easily try out new features that are proposed in the PSR-5 Draft.
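For readers who have never seen one: a lexer only chops the DocBlock text into typed tokens and leaves the interpretation to a parser. The sketch below is a toy in Python and not the actual proof of concept; the token kinds (OPEN, CLOSE, NEWLINE, TAG, TEXT) are assumptions chosen for the example and cover far less than the PHPDoc syntax does.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token kinds. Order matters: the first matching alternative wins.
TOKEN_RE = re.compile(
    r"""
    (?P<OPEN>/\*\*)                        # start of the DocBlock
  | (?P<CLOSE>\*/)                         # end of the DocBlock
  | (?P<NEWLINE>\n[ \t]*\*?(?!/)[ \t]?)    # line break plus the leading asterisk gutter
  | (?P<TAG>@[A-Za-z][\w\\-]*)             # tag name such as @param
  | (?P<WS>[ \t]+)                         # inline whitespace, dropped by lex()
  | (?P<TEXT>[^\s@*]+|[@*])                # any other run of characters
    """,
    re.VERBOSE,
)


def lex(docblock: str) -> Iterator[Token]:
    """Yield tokens in source order, skipping inline whitespace."""
    for match in TOKEN_RE.finditer(docblock):
        if match.lastgroup != "WS":
            yield Token(match.lastgroup, match.group())
```

Calling list(lex("/** @return int */")) yields an OPEN token, a TAG token for @return, a TEXT token for int and a CLOSE token; a parser would then turn that stream into summary, description and tag objects.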

The demo feels promising, however. Current tests show that a complete DocBlock (summary, description, 2 annotations and 3 tags) is lexed in approximately 500 to 600 microseconds (meaning I can lex 10,000 large-ish DocBlocks in 5 to 6 seconds). With some tweaking we can probably improve that number, but the parser will add its own time on top of the lexing, so the overall throughput will be lower than that. We will have to see how it works out by doing practical tests once it is finished.

Templates and performance

The biggest performance gain must come from the templates. We have noticed that the current Clean template performs awfully and this is something I am quite sad about. I have noticed a trend in the past versions that Twig seems to be considerably slower compared to the original XSL-based templates. This bums me out because Twig-based templates are so much easier to read, to write and to install because they do not need an additional extension.

I am still chewing on this specific issue; it is clear to me that at least the Clean template should be fixed, as it is quite performance hungry. Aside from that, I plan on building a series of base blocks that are optimized for performance, which can be used to assemble custom templates and will serve as a basis for our own templates.

In addition to the above, I am still looking for a simple templating engine that has very high performance and does not require external applications or extensions, because such requirements would make phpDocumentor harder to install. I am thinking of just using PHP as a templating language, PHP Mustache, or even just going back to XSL. In the end I believe performance matters a great deal.

Conclusion

There you have it. phpDocumentor has a lot of goodness ahead of it, and this is not all. The roadmap is overflowing with awesome stuff that I would want to see in phpDocumentor. However, due to the amount of work involved, we are unable to meet our release schedule for now. I hope that you understand.

Do you want to help out or do you have a good suggestion for a templating engine? Do you have questions? Contact me! I can’t wait to hear from you.