melp.nl

Composer local package mirroring: Press the pedal to the metal.

In my previous blog post, I told you about hosting local package repositories for composer. I, or to be honest, my colleagues, weren't too excited about the performance gains. So I decided to dig a bit deeper.

A sidenote on the examples

php *.phar makes me wanna curl up in a corner and quit my job. That's because I don't like typing that stuff. We're in a UNIX world where we control our own destiny and we can decide whatever the hell we want to call our binaries or scripts. What I always do for scripts like this is simply make the .phar files executable and symlink them into /usr/local/bin. So that's why you'll see composer and satis in my examples rather than php composer.phar or php satis.phar.

This is just a sidenote; if you still want to use php whatever.phar, you are of course very welcome to do so ;)

Pitfall avoided

Before I tell you what you need to do, there is one little bugger you need to be aware of. Satis uses your composer config (in the COMPOSER_HOME) to resolve versions as well. As I wasn't aware of that, and my composer cache was tainted with all kinds of repositories, I didn't realize that satis would take this data into account when writing out the packages.json file. It makes sense that satis would use Composer libs for resolving this, but also reading the composer cache is a bridge too far, if you ask me.

But as long as you're not asking me, I won't elaborate ;). I fixed this by having my build scripts for the satis package repository expose a different COMPOSER_HOME to satis:

#!shell
COMPOSER_HOME=./.composer satis build satis.json .

This way, satis will try to read data from the specified directory, which will ignore any configuration from your regular composer config. It won't be surprising that the config file should exclude packagist.org, which the author thought of already, but what might be surprising is that you shouldn't use any other repository at all.

In my testing environment, I had a local repository enabled, which was filled with data from packagist. Why? Because I built it with dependencies referencing packagist, and these references prove to be quite contagious. Long story short, just put the following in your config and let your COMPOSER_HOME point there to avoid weirdness.

#!javascript
{ "repositories": [ { "packagist": false } ] }

In other words: isolate the environment building the satis repository from the one testing it and you'll be fine.
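
If you want to script that isolation, something like this would do. It's only a sketch: the function name init_satis_home is made up for this example, and all it does is write the config from above into the ./.composer directory used in the satis build command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create a throwaway COMPOSER_HOME whose only
# configuration is the "no packagist" snippet shown above.
# Prints the directory it prepared.
init_satis_home() {
    local home_dir="${1:-./.composer}"
    mkdir -p "$home_dir"
    printf '%s\n' '{ "repositories": [ { "packagist": false } ] }' \
        > "$home_dir/config.json"
    echo "$home_dir"
}

# Usage (needs satis on your PATH):
#   COMPOSER_HOME="$(init_satis_home)" satis build satis.json .
```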

Part 1: Generating the source packages

To mirror a github repository, all you really need is ssh, git and an SSH account on some machine that all your coworkers can access. At Zicht, we have a local development server where all employees can log in using SSH. In this example, I will call this machine springfield and the user accessing it homer, hence homer@springfield.

Create a list of github repositories you need

Github is our starting point. Most packages (if not all) you will need are probably there and have a composer.json file in them, so Satis will be able to generate a packages.json for them. Here's an example of some repository names you might use.

symfony/symfony
doctrine/common
fabpot/Twig
fabpot/Twig-extensions
php-fig/log
... etc 

Save this in a file called packages.list. A more extensive list, which you would need for any Symfony project, is included at the end of this post. That will save you some time figuring out the dependencies.

Mirror the repositories

Now, based on these github repository names, we can start mirroring stuff. There are basically two ways you can go here. Either you remove all local mirrors of the packages and clone them again, or fetch all branches for each of the previously mirrored repositories. The most practical would be a script that would do the latter if the mirror exists, but the former if not.

Log in at homer@springfield and cd to the directory where you created packages.list and from which you will host your satis repository later on. Let's say this is homer@springfield:~/satis.

Then execute the following piece of code:

#!bash

# clones all repositories that have not been mirrored yet:
(
    mkdir -p packages;
    cd packages;
    for r in $(cat ../packages.list); do
        if ! [ -d "$r.git" ]; then
            # create the owner directory first; --mirror already implies --bare
            mkdir -p "$(dirname "$r")" && git clone --mirror "https://github.com/$r.git" "$r.git";
        fi;
    done;
)

A slightly modified version of the script will update the clones:

#!bash
( 
    cd packages; 
    for r in $(cat ../packages.list); do               
        if [ -d $r.git ]; then 
            ( cd $r.git && git fetch --prune );
        fi;
    done; 
)
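
The two snippets above can also be folded into one: clone when the mirror is missing, fetch when it's there. Here's a rough sketch of that. The function names and the GITHUB_BASE variable are made up for this example; they're not something git or satis provide.

```shell
#!/usr/bin/env bash
# GITHUB_BASE is an assumption added here to make the source configurable;
# it defaults to the URL used in the snippets above.
GITHUB_BASE="${GITHUB_BASE:-https://github.com}"

# Prints "fetch" if the mirror directory exists, "clone" otherwise.
mirror_action() {
    if [ -d "$1" ]; then echo fetch; else echo clone; fi
}

# Reads "user/name" lines from the given list and syncs
# packages/<user>/<name>.git for each of them.
sync_mirrors() {
    local list="$1" r
    mkdir -p packages
    while IFS= read -r r || [ -n "$r" ]; do
        [ -n "$r" ] || continue   # skip blank lines
        case "$(mirror_action "packages/$r.git")" in
            clone) git clone --mirror "$GITHUB_BASE/$r.git" "packages/$r.git" ;;
            fetch) ( cd "packages/$r.git" && git fetch --prune ) ;;
        esac
    done < "$list"
}

# Usage, from ~/satis:
#   sync_mirrors packages.list
```

Keeping the decision in its own little function makes it easy to check what would happen to a repository before any network traffic is involved.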

Generate a satis.json

Satis needs the repository URLs on the machine so it can read the composer.json files from them and generate a packages.json. The following PHP script converts the list of packages into a list of repository URLs that Satis understands. Since the satis repository must be accessible via HTTP as well, I am assuming the directory at homer@springfield:~/satis is accessible at the following URL: http://springfield/~homer/satis.

#!php
<?php
# satis.json.php

$mirror =  'homer@springfield:~/satis/packages';

$repos = array();
foreach (array_filter(array_map('trim', file('php://stdin'))) as $package) {
    $repos[]= array(
        'type' => 'vcs',
        'url' => sprintf('%s/%s.git', $mirror, $package)
    );
}
?>
{
    "name": "Github Satis Mirror",
    "url": "http://springfield/~homer/satis",
    "homepage": "http://springfield/~homer/satis",
    "repositories": <?php echo json_encode($repos); ?>
}

Run the script as follows:

#!shell
# homer@springfield:~/satis
php satis.json.php < ./packages.list > ./satis.json
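
To preview which URLs will end up in satis.json without running the PHP script, the same mapping can be sketched in shell. repo_urls is a hypothetical helper name; the mirror prefix is the one used in satis.json.php.

```shell
#!/usr/bin/env bash
# Prints the repository URL that satis.json.php builds for each
# "user/name" line on stdin. Lines are trimmed and blank lines skipped,
# similar to the trim/array_filter combination in the PHP script.
repo_urls() {
    local mirror="$1" package
    while IFS= read -r package || [ -n "$package" ]; do
        package="$(printf '%s' "$package" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')"
        [ -n "$package" ] || continue
        printf '%s/%s.git\n' "$mirror" "$package"
    done
}

# Usage, in ~/satis:
#   repo_urls 'homer@springfield:~/satis/packages' < packages.list
```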

Generate the packages.json file

Now, render the packages.json file:

#!shell
# homer@springfield:~/satis
satis build ./satis.json . 

This will generate the packages.json and an index.html which you should now be able to access through http://springfield/~homer/satis/. Assuming your http configuration is there. And you have the same setup... Well, you get the idea ;)

Part 2: Generating dist packages

When you use dist packages, composer will cache the downloads in your local composer cache. This makes stable (tagged) version specs combined with --prefer-dist the most effective and performant way to download and include packages in your project. Another advantage is that you won't have the .git meta folders in your vendor dir, which simply saves disk space.

To do this, we can use the git archive utility. For each of the downloaded packages, we'll find out what the available tags are and generate a zip archive for each of them. I went with zip because tar archives gave me trouble in composer, which uses Phar to extract tar files. It doesn't really make sense to me that Phar is used for this (UNIX principle, anyone...?) but that's another story.

Generate .zip archives for all locally mirrored git repositories

#!bash
( 
    cd packages;
    for d in */*.git; do
        ( 
            cd $d; 
            for t in $(git tag -l); do
                if ! [ -f $t.zip ]; then
                    echo "Building $d/$t.zip"
                    git archive $t^{tree} -o $t.zip;
                fi;
            done;
        );
    done; 
);

As you can see, this little snippet walks through all of the previously downloaded packages and generates a .zip archive for each of the tags that are present in the git clone. If the package file already exists, it is skipped, so the snippet is incremental, just like the mirror scripts above.
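
To see what the loop would still build for a repository without actually building anything, a dry-run variant could look like this. missing_archives is a made-up name; it only looks for [tag].zip files, just like the check in the loop above.

```shell
#!/usr/bin/env bash
# Reads tag names on stdin and prints the ones that have no <tag>.zip
# in the given directory yet (defaults to the current directory).
missing_archives() {
    local dir="${1:-.}" t
    while IFS= read -r t || [ -n "$t" ]; do
        [ -n "$t" ] || continue
        [ -f "$dir/$t.zip" ] || echo "$t"
    done
}

# Usage, inside one of the mirrored repositories:
#   git tag -l | missing_archives .
```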

Adding the dist references to packages.json

We generated a packages.json file before, which contains references to all of the source packages at homer@springfield:~/satis/packages/[user]/[name].git. To add the archives to each of the versions in the packages.json, we again use a simple PHP script:

#!php
<?php
# add-dist.php 

$rootUrl = 'homer@springfield:~/satis/packages';
$publicUrl = 'http://springfield/~homer/satis/packages/';

$type = $_SERVER['argv'][1];

$packages = json_decode(file_get_contents('php://stdin'), true);

foreach ($packages['packages'] as $name => $versions) {
    foreach ($versions as $versionId => $spec) {
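        // strip the mirror prefix and the leading slash, so the path can be
        // appended to both 'packages/' and $publicUrl without a double slash
        $packagePath = ltrim(str_replace($rootUrl, '', $spec['source']['url']), '/');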

        $distFile = $packagePath . '/' . $versionId . '.' . $type;

        if (is_file('packages/' . $distFile)) {
            in_array('-v', $_SERVER['argv']) && fwrite(STDERR, "Found $distFile\n");
            $packages['packages'][$name][$versionId]['dist'] = array(
                'type' => $type,
                'url' => $publicUrl . $distFile
            );
        }
    }
}
echo json_encode($packages);

The script reads packages.json from its input, walks through all of the versions, and checks if an archive is available in the packages directory. Again, run the script from the ~/satis directory, like this:

#!shell
# in homer@springfield:~/satis
php add-dist.php zip < ./packages.json > ./packages-with-dist.json
mv packages-with-dist.json packages.json

The script is incremental again, so you can repeat this as many times as you like without needing to rewrite the original packages.json again.
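
A quick sanity check after this step is counting how many versions got a dist entry. The sketch below assumes the compact output of json_encode (no space between the key and the colon), and count_dist_entries is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Counts the "dist" keys in a packages.json. This is a rough text match,
# not a JSON parse, so treat the number as an indication only.
count_dist_entries() {
    grep -o '"dist":' "$1" | wc -l | tr -d ' '
}

# Usage, in ~/satis:
#   count_dist_entries packages.json
```

If the number stays at 0, the archives are probably not where add-dist.php looks for them.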

Your repository is now ready for use. But you should note the following section before you start using it.

Part 3: Getting your local config to play nicely

You should eradicate all github references from your composer.lock files. Since composer uses a shared cache for all of your projects, your cache will get tainted with github references from your composer.lock files any time you do a composer install. This means the composer.lock of one project can influence the repository URLs used in any other project, as long as they share the same cache. So to make sure your local package repositories are used, some blunt force is required.

Exclude packagist from the default config

Don't use packagist any more. If you do use it, all effort was in vain. Configure your local config.json as follows:

#!javascript
{
    "repositories": [
        { "packagist": false },
        { "type": "composer", "url": "http://springfield/~homer/satis/" }
    ]
}

Remove all github references. Rinse. Repeat.

While you are exorcising the demons from your composer.lock file, you should keep removing the cache you are using. By default, this is in your home dir at ~/.composer/cache. You also need to update your composer.lock file, which may need some fine-tuning of the version specs you're using in your project. You can run composer with high verbosity to detect any use of github. If it uses github, you're either missing a github repository in your packages.list, or you need to update that package.

Here's what you need to do, in pseudocode.

while either (
        my composer.lock file contains github references
    OR  my cache contains github references
    OR  composer tells me it wants to try to download something from api.github.com
) {
    I shall:
        remove my composer cache entirely by executing 'rm -rf ~/.composer/cache'
        entirely remove the vendor dir by executing 'rm -rf vendor'
        verify that all my packages mentioned in composer.json are available 
             at the `springfield` server
        verify that the composer.json contains only packages that are available 
             at the `springfield` server
        use 'composer update -vvv' to verify what composer wants to download
}

Use the shell to verify that everything's cool:

#!shell
grep '"url".*github.com' ./composer.lock    # should return nothing
composer install -vvv 2>&1 | grep 'github'  # should also return nothing
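
If you want to run that check from a script, the first grep can be wrapped in a function that reports through its exit status. lock_is_clean is a name made up for this sketch; it uses the same idea as the grep above, with the dot escaped.

```shell
#!/usr/bin/env bash
# Exit status 0 if the given lock file has no "url" entries pointing at
# github.com, non-zero otherwise. A missing file also counts as clean,
# which matches the "remove your composer.lock and start over" advice.
lock_is_clean() {
    ! grep -q '"url".*github\.com' "$1" 2>/dev/null
}

# Usage:
#   lock_is_clean composer.lock || echo "still tainted, rinse and repeat"
```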

Remove your composer.lock file and start from scratch if you can't get it to work.

Prefer dist packages, always. Unless you need source. Duh.

Add the following section to your ~/.composer/config.json to prefer dist packages.

#!javascript
{
    "config": {
        "preferred-install": "dist"
    }
}

The end result

Here's a little script to test the performance gain.

#!bash

rm -rf ~/tmp/time-composer && mkdir -p ~/tmp/time-composer && cd $_;
mkdir regular-config local-config

# This should be the config.json you're about to use
cp ~/.composer/config.json local-config

# This is the same one, in my case already containing a github OAuth token,
# but without the "repositories" section.
php -r'$o = json_decode(file_get_contents("php://stdin")); unset($o->repositories); echo json_encode($o);' \
     < ~/.composer/config.json \
     > ./regular-config/config.json

echo "Using out of the box config:"
rm -rf ./project && mkdir project && cd $_;
time COMPOSER_HOME=../regular-config/ composer require "symfony/symfony:2.3.*@stable" --prefer-dist
du -sh .
cd ..

echo "Using local config:"
rm -rf ./project && mkdir project && cd $_;
time COMPOSER_HOME=../local-config/ composer require "symfony/symfony:2.3.*@stable" --prefer-dist
du -sh .
cd ..

I just ran this on the same box the mirror repositories are on, and the result is as follows (excluding all the output of composer itself):

#!shell
Using out of the box config:

real    0m26.529s
user    0m2.896s
sys 0m0.424s
36M .
Using local config:

real    0m4.653s
user    0m4.660s
sys 0m0.352s
36M .

Since most coworkers are actually on that same server working on their projects, it's representative for our case. But even using the same settings on another machine in the same network will show similar results. That's worth the trouble.
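
For the record, here's the arithmetic behind that: dividing the two real values from the run above gives a factor of roughly 5.7. speedup is a hypothetical helper name; awk does the division.

```shell
#!/usr/bin/env bash
# Ratio of two wall-clock times in seconds, rounded to one decimal.
speedup() {
    LC_ALL=C awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", before / after }'
}

speedup 26.529 4.653   # the two "real" values measured above, prints 5.7
```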

The packages.list

Here's the list I use currently. For any average Symfony project this will probably be enough, but you can amend it to your needs, obviously.

doctrine/annotations
doctrine/cache
doctrine/collections
doctrine/common
doctrine/data-fixtures
doctrine/dbal
doctrine/doctrine2
doctrine/DoctrineBundle
doctrine/DoctrineFixturesBundle
doctrine/inflector
doctrine/lexer
fabpot/Twig
fabpot/Twig-extensions
jdorn/sql-formatter
KnpLabs/KnpMenu
KnpLabs/KnpMenuBundle
kriswallsmith/assetic
l3pp4rd/DoctrineExtensions
liip/LiipImagineBundle
php-fig/log
schmittjoh/cg-library
schmittjoh/JMSAopBundle
schmittjoh/JMSDiExtraBundle
schmittjoh/JMSSecurityExtraBundle
schmittjoh/metadata
schmittjoh/parser-lib
schmittjoh/php-option
Seldaek/monolog
sensiolabs/SensioDistributionBundle
sensiolabs/SensioFrameworkExtraBundle
sensiolabs/SensioGeneratorBundle
sensio/SensioDistributionBundle
sensio/SensioFrameworkExtraBundle
sensio/SensioGeneratorBundle
stof/StofDoctrineExtensionsBundle
swiftmailer/swiftmailer
symfony/symfony
symfony/AsseticBundle
symfony/MonologBundle
symfony/SwiftmailerBundle
symfony/BrowserKit
symfony/ClassLoader
symfony/Config
symfony/Console
symfony/CssSelector
symfony/Debug
symfony/DependencyInjection
symfony/DomCrawler
symfony/EventDispatcher
symfony/Filesystem
symfony/Finder
symfony/Form
symfony/HttpFoundation
symfony/HttpKernel
symfony/Locale
symfony/Intl
symfony/Icu
symfony/OptionsResolver
symfony/Process
symfony/PropertyAccess
symfony/Routing
symfony/Security
symfony/Serializer
symfony/Stopwatch
symfony/Templating
symfony/Translation
symfony/Validator
symfony/Yaml
sonata-project/exporter
sonata-project/SonataAdminBundle
sonata-project/SonataBlockBundle
sonata-project/SonataCacheBundle
sonata-project/sonata-doctrine-extensions
sonata-project/SonataDoctrineORMAdminBundle
sonata-project/SonatajQueryBundle
fzaninotto/Faker

I hope this will help you get started and save you loads of time in your development and release process.



