Npm Needs a Personal Trainer

Corey Butler
Published in Author.io
7 min read · Feb 9, 2015

It’s hard to find developers who don’t know what npm is. With 350K+ unique modules in store at the beginning of 2017, it’s the “go to” resource for node.js. But npm is overweight for its age.

Childhood Obesity is a Problem… for npm

Npm is growing by leaps and bounds. Some growth is healthy, some is not. Like a child, it needs a proper diet of good practices and developers who exercise them.

350K is a lot of modules, but is that too many? It’s hard to tell. There are lots of unique cases where a slight variation of a module fits one use case better than another. Perhaps there is some room to trim unnecessary modules, but there are other things you can do.

Instead of worrying about the number of modules, developers should focus on making their own modules healthier. Everyone pitches in for a healthier community.

In meetup presentations and tech talks, it’s common to hear a developer pitch the ease of use their module offers as a way to convince you it’s the best thing since sliced bread.

You just

npm install awesome-sauce 

… and you’re off to the races! Right? Maybe not.

Have you looked at the code this awesome module downloaded into your project folder? Look carefully. Some IDEs hide the node_modules folder from view to present the appearance of a clean project. This is like black clothing… slimming, but still a facade.

There are many modules with excessive “extras” delivered to npm. In my personal experience, I’ve seen numerous modules containing example folders, documentation directories, and all sorts of other stuff. In some cases, the functional JavaScript files collectively weigh in under 100KB, accompanied by 20MB of documentation and example files.
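To see how much weight each installed module actually carries, a short Node script can total the bytes on disk per package. This is only a rough sketch: the function names `dirSize` and `packageSizes` are made up for illustration, symlinks are skipped, and a scoped folder such as `@scope` is reported as a single entry.

```javascript
const fs = require("fs");
const path = require("path");

// Recursively sum the size in bytes of every regular file under dir.
function dirSize(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      total += dirSize(full);
    } else if (entry.isFile()) {
      total += fs.statSync(full).size;
    }
    // Symlinks and other entry types are skipped to avoid cycles.
  }
  return total;
}

// List each top-level folder under node_modules with its size, largest first.
function packageSizes(nodeModulesDir) {
  return fs
    .readdirSync(nodeModulesDir, { withFileTypes: true })
    .filter((entry) => entry.isDirectory() && !entry.name.startsWith("."))
    .map((entry) => ({
      name: entry.name,
      bytes: dirSize(path.join(nodeModulesDir, entry.name)),
    }))
    .sort((a, b) => b.bytes - a.bytes);
}

// Example usage from a project root:
//   console.table(packageSizes("node_modules"));
```

Running it against a project before and after pruning a dependency gives a quick feel for where the bulk lives.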

The Node community needs to minimize module footprints.

What Difference Does it Make?

Underscore is a popular module. According to today’s stats, it was downloaded 87,920 times yesterday alone and 4.5M times in the last month. Now imagine if 1.5KB was stripped out of the library (which happens to be about the size of the README file). 4.5M × 1.5KB ≈ 6.4GB of transfer saved per month.
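The arithmetic is easy to replay for any module. The sketch below assumes binary units (1GB = 1024 × 1024KB), which is what produces the ~6.4 figure; with decimal units the same inputs give 6.75GB. The function name is hypothetical.

```javascript
// Estimate the monthly transfer saved by trimming a package, in binary gigabytes.
function monthlySavingsGiB(downloadsPerMonth, kibSavedPerDownload) {
  const totalKiB = downloadsPerMonth * kibSavedPerDownload;
  return totalKiB / (1024 * 1024); // KiB -> GiB
}

// 4.5M monthly downloads, 1.5KB trimmed from each one.
console.log(monthlySavingsGiB(4.5e6, 1.5).toFixed(1)); // prints "6.4"
```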

I picked on underscore specifically because of its popularity and its relatively good job of ignoring extra files. Even so, at the time of this writing, a straight npm installation still includes both the minified (16KB) and unminified (47KB) versions of the library, plus the LICENSE and README files at 2KB each.

Publish Only What You Need

There are two primary approaches for minimizing the footprint of an npm module. Whitelisting is accomplished with the package.json file, while blacklisting is accomplished with the .npmignore file.

Slimmer Pickings: Whitelist files in package.json

Whitelisting is the most effective/robust way to reduce your npm footprint. Specify only the relevant resources in the “files” field of package.json.

Certain files are always included, regardless of settings:

  • package.json
  • README (and its variants)
  • CHANGELOG (and its variants)
  • LICENSE / LICENCE

Conversely, some files are always ignored:

  • .git
  • CVS
  • .svn
  • .hg
  • .lock-wscript
  • .wafpickle-N
  • *.swp
  • .DS_Store
  • ._*
  • npm-debug.log

By explicitly identifying the necessary files, developers guarantee slimmer packages, even if something slips into the code base by mistake (like an examples directory).
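As a minimal sketch of what a whitelist looks like, here is a manifest written as a JavaScript object and printed as the JSON that would live in package.json. Everything in it is hypothetical: the version number, the index.js entry point, and the lib/ directory are placeholders, not a recommendation for any particular layout.

```javascript
// Hypothetical manifest for the awesome-sauce module from earlier in the article.
const manifest = {
  name: "awesome-sauce",
  version: "1.0.0",
  main: "index.js",
  // Only these paths (plus the always-included files) are meant to be published.
  files: ["index.js", "lib/"],
};

// Print the object in the shape it would take inside package.json.
console.log(JSON.stringify(manifest, null, 2));
```

With a list like this in place, an examples or docs directory added later stays out of the published package unless it is added to "files" on purpose.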

The Ignored Diet: Blacklisting Files

Npm has an ironically often ignored .npmignore capability. If the package.json “files” attribute is for whitelisting, .npmignore is the equivalent of blacklisting.

This file is similar to the .gitignore file. It prevents specific files and folders from being published to the npm registry. It’s simple and makes your modules lean. You should be using it in every module you publish if you’re not using whitelisting.

Here’s an example of a common ignore file I use in my modules:

_*
.*
*.log
*.md
*.yml
examples
docs
test

This file prevents tests, examples, and docs from shipping. It also prevents things like CI/CD configurations, extra markdown (remember README is always included no matter what you do), pesky development logs, files like .gitignore, and all files that start with an underscore. The underscore is a personal convention I use, such as a _todo directory.
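One way to check that an ignore file (or a whitelist) does what you expect is to ask npm what it would pack without actually publishing. The sketch below shells out to `npm pack --dry-run --json` and sorts the reported files by size. The helper names are made up, and the code assumes the JSON report is an array whose first entry has a `files` list of `{ path, size }` objects. Lifecycle scripts that print to stdout could break the JSON parsing.

```javascript
const { execFileSync } = require("child_process");

// Reduce the parsed npm report to { path, size } entries, largest first.
function summarizePack(report) {
  const [pkg] = report; // assumption: one entry per packed package
  return pkg.files
    .map(({ path, size }) => ({ path, size }))
    .sort((a, b) => b.size - a.size);
}

// Ask npm which files it would publish from packageDir, without writing a tarball.
// On Windows, npm is a .cmd shim, so this call may need the `shell: true` option.
function listPublishedFiles(packageDir = ".") {
  const stdout = execFileSync("npm", ["pack", "--dry-run", "--json"], {
    cwd: packageDir,
    encoding: "utf8",
  });
  return summarizePack(JSON.parse(stdout));
}

// Example usage from a module's root directory:
//   console.table(listPublishedFiles());
```

If tests, docs, or examples show up in the list, the ignore patterns need another look.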

For more examples, see my npm profile or the source code at github.com/coreybutler.

What REALLY Needs to be Published to npm?

For node use, is it really necessary to have both minified and unminified versions of a file? Is the README really necessary? How many copies of this are on your production server? Are you running your projects on a low-end $5 virtual machine or a free OpenShift instance? While space is often cheap, don’t abuse it.

Some files offer convenience, but I’d argue people aren’t even looking at the contents of a package. When they do, they’re usually browsing through it on GitHub. If you go to the effort of creating a large amount of documentation, take the final step and publish it as a GitHub page, or create a wiki. Just get it out of npm.

Cutting the Cruft

The point is to use reasonable judgement in determining what’s really necessary in a published module. Here’s a starter list of things you might be able to get rid of.

Any dotfile (.gitignore, .jshintrc, .editorconfig, etc)

While these are typically very small files, they don’t usually provide functional value to the published module. It’s easy to remove them all in one swoop by adding the following to .npmignore:

.*

Any Markdown File

Again, these files typically aren’t providing functional value to a module. Host these on a GitHub page/wiki instead. This includes the large README file content, CONTRIBUTOR/CONTRIBUTING, etc.

The License File?

This one is trickier, because some licenses require distribution with the code. If in doubt, leave this one in your code base.

GOTCHA!

Remember, npm will include certain files regardless of whether they’re ignored. This includes LICENSE, README, & CHANGELOG (and variants). The only real option is to not have these files at all, or reduce their content. For example, provide a link to your GitHub wiki/page instead of adding the content to the README. This is a shame since sites like GitHub do so many nice things automatically with the README, yet so many servers get cluttered with this completely unnecessary addition to the production environment.

Unit Tests

Developers sometimes think people are running their carefully crafted test suites, but the reality is most don’t even know tests exist. Caring that tests exist is even less likely. Most developers only run a test suite when there’s a problem. Less experienced/patient developers will skip your module in favor of another that works as they expect. If they do care about unit tests, they’ll look at your GitHub repository or the status of your CI service. You are using CI, right?

If you’re not using Travis or another service for your open source modules, it’s worth taking a look. Travis, Shippable, CodeShip, AppVeyor, and others offer free CI services for open source projects.

Since you’re a responsible developer using a CI service, you should add its configuration files, such as .yml or .yaml files, to .npmignore.

In the rare situation a developer actually wants to run your test suite on their own local computer, they’ll likely clone or fork it. The bottom line is tests really don’t need to be published to npm.

Examples & Documentation

Many people create an “examples” folder to demonstrate how to use the module. Again, the common workflow of the average developer usually does not include running examples out of an npm deployment. They’ll visit the public page (GitHub/Bitbucket/whatever) for help. Like tests, if they want to run the examples, they’ll clone the project.

Project & Editor Files

Any other project/editor specific files don’t need to be deployed to npm. This includes build files. The module may have an awesome grunt or gulp process to streamline something, but unless you’re distributing a grunt/gulp plugin, you don’t need to include these files in your npm package.

Exceptions to the Rule

There are going to be exceptions to the rule. For example, Yeoman generators will include a lot of template files that may fit some of the suggested ignore patterns above. Use reasonable judgement. Think about how people will actually use your module.

Trimming Fat Dependency Chains

So far, this article has focused mostly on stripping unnecessary files out of published npm modules. That’s the least developers can do. There are other “exercises” to make modules strong.

Dependency chains can get pretty long. Some get so long Windows users are greeted with “path too long” error messages when attempting to delete them. This is ridiculous. Operating systems aside, if you need to troubleshoot a deeply nested module within the node_modules directory, it’s a bit painful.

Dependency chains can be simplified through the process of flattening. This concept means moving a module “up the chain”. For example, examine the following node_modules dependency chain:

node_modules
  module-a
    module-b
      module-c
        module-d
  module-e
    module-b
      module-c
        module-d
  module-f
  module-g
    module-b
      module-c
        module-d
  module-h
    module-b
      module-c
        module-d

Module “B” depends on “C”, which depends on “D”. Modules E/G/H also depend on module “B” and its entire dependency chain. This chain should be flattened by moving Module “B” up, resulting in:

node_modules
  module-a
  module-e
  module-f
  module-g
  module-h
  module-b
    module-c
      module-d

When node cannot find a dependency, it looks “up” the chain for the next node_modules directory. Both examples function the same way, but the dependency chain is smaller in the second. There are fewer copies of the same module, reducing the overall module footprint.
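The “look up the chain” behavior can be sketched in a few lines. This is a simplified illustration, not Node’s actual resolver: it only builds the list of node_modules directories that would be tried, nearest first, and takes an `exists` callback (a stand-in for a filesystem check) so it can be exercised without touching the disk. The paths and function names are hypothetical. In a real project, `require.resolve.paths(name)` reports the list Node itself uses.

```javascript
const path = require("path");

// List the node_modules directories searched from startDir, nearest first.
// startDir is expected to be an absolute POSIX-style path.
function candidateDirs(startDir) {
  const dirs = [];
  let current = path.posix.normalize(startDir);
  while (true) {
    // A node_modules folder is never searched for a nested node_modules/node_modules.
    if (path.posix.basename(current) !== "node_modules") {
      dirs.push(path.posix.join(current, "node_modules"));
    }
    const parent = path.posix.dirname(current);
    if (parent === current) break; // reached the filesystem root
    current = parent;
  }
  return dirs;
}

// Return the first location where `name` is installed, or null if none matches.
function resolveFrom(startDir, name, exists) {
  for (const dir of candidateDirs(startDir)) {
    const candidate = path.posix.join(dir, name);
    if (exists(candidate)) return candidate;
  }
  return null;
}

// With the flattened layout, module-e finds module-b one level up.
const installed = new Set(["/app/node_modules/module-b", "/app/node_modules/module-e"]);
console.log(resolveFrom("/app/node_modules/module-e", "module-b", (p) => installed.has(p)));
// prints "/app/node_modules/module-b"
```

Because the nearest directory wins, a nested copy still takes precedence when one exists, which is why flattening only works when a single version satisfies every dependent.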

UPDATE: There are some tools for flattening dependency chains, like the npm dedupe command.
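Before reaching for a tool, it can help to see how many copies of each module a project is carrying. The following rough sketch walks a node_modules tree, including nested node_modules folders, and reports every package name installed more than once. The function names are made up for illustration. It compares names only, so copies at different versions (which cannot always be merged) are counted too, and symlinked packages are skipped.

```javascript
const fs = require("fs");
const path = require("path");

// Record every installed package under dir, following nested node_modules folders.
function collectCopies(dir, copies = new Map(), scope = "") {
  if (!fs.existsSync(dir)) return copies;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isDirectory() || entry.name.startsWith(".")) continue;
    const full = path.join(dir, entry.name);
    if (!scope && entry.name.startsWith("@")) {
      // Scoped packages live one level down, e.g. @scope/name.
      collectCopies(full, copies, entry.name + "/");
      continue;
    }
    const name = scope + entry.name;
    copies.set(name, [...(copies.get(name) || []), full]);
    collectCopies(path.join(full, "node_modules"), copies); // nested dependencies
  }
  return copies;
}

// Report package names that appear more than once, most copies first.
function findDuplicates(nodeModulesDir) {
  return [...collectCopies(nodeModulesDir)]
    .filter(([, paths]) => paths.length > 1)
    .map(([name, paths]) => ({ name, copies: paths.length }))
    .sort((a, b) => b.copies - a.copies || a.name.localeCompare(b.name));
}

// Example usage from a project root:
//   console.table(findDuplicates("node_modules"));
```

For the unflattened example above, this would report four copies each of module-b, module-c, and module-d.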

The Developer Experience

Some folks believe the mere act of publishing a module to npm will attract contributors and help their project flourish. Not really. That discussion could easily be a completely different article, but if you’re ultimately looking for contributors, it’s easier to attract them with lean modules. The code structure should make sense, run lean, and not require hunting through a mountain of dependencies.

Happy lean publishing, and let’s make npm fit for all of us! Now I have to go clean up some of my modules :-)

Written by Corey Butler

I build communities, companies, and code. Cofounder of Author Software, Inc. Created Fenix Web Server & NVM for Windows.

Responses (2)

I really liked the section, “Trimming Fat Dependency Chains.” I was not aware that modules will look ‘up the chain.’ Do you know of any tools that can help out with flattening?

Having the README and tests in disk is quite useful because otherwise you need to go to github and search for the specific version you have in disk in order to read the documentation or tests. Plus a lot of module authors doesn’t push the tags to…
