Full-Text Indexing PDFs in Javascript
I once worked for a company that sold access to legal and financial databases (as they call it, “intelligent information”). Most court records are PDFs available through PACER, a website developed...
Parsing PDFs at Scale with Node.js, PDF.js, and Lunr.js
Technologies used: Vagrant + VirtualBox, Node.js, node-static, Lunr.js, node-lazy, PhantomJS. Much information is trapped inside PDFs, and if you want to analyze it you’ll need a tool that extracts the...
Functional Programming Patterns in Four Popular Javascript Libraries
I generally find discussions of design patterns a bit dry, but in testing new Javascript libraries, I’ve stumbled across some interesting tactics. Object-oriented design patterns are typically not a...
Parsing Javascript in Javascript
Closures are a language feature that allows the programmer to inject static variables into a function’s scope, without the use of method parameters. E.g.: a = "b"; function c() { return a; } c() "b" A...
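To make the excerpt's point concrete, here is a minimal sketch of a closure that keeps private state. The names `makeCounter`, `counterA`, and `counterB` are made up for illustration and do not come from the article.

```javascript
// Hypothetical example: each call to makeCounter creates a new scope,
// and the returned function keeps access to that scope's `count`.
function makeCounter(start) {
  let count = start; // captured by the inner function, not passed as a parameter

  return function next() {
    count += 1;
    return count;
  };
}

const counterA = makeCounter(0);
const counterB = makeCounter(10);

console.log(counterA()); // 1
console.log(counterA()); // 2
console.log(counterB()); // 11 (independent of counterA)
```

Each counter holds its own copy of `count`, so calling one never affects the other.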
Data Exploration in Javascript
Google Analytics has a nice screen which shows alerts for changes that appear interesting – basically any large increase or decrease in traffic from a particular source: With appropriate API hooks,...
Identifying important keywords using Lunr.js and the Blekko API
Lunr.js is a simple full-text engine in Javascript. Full text search ranks documents returned from a query by how closely they resemble the query, based on word frequency and grammatical considerations...
Javascript to remove line number, author, revision columns from Fisheye/Crucible
Fisheye puts a bunch of useful columns in code reviews, but they’re irritating if you want to copy code out, because they copy too: I’ve found it helpful to create bulk reviews to view patches, where...
Fixing the error “TypeError: ‘undefined’ is not a function (evaluating...
Some libraries, like PDF.js, initialize their own logging function, which wraps console.log. If this runs in a context where Function.prototype.bind does not exist, you’ll get the following error: TypeError:...
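One common way to cope with a missing `bind` is a small fallback helper. The sketch below is an assumption about how that could look, not the fix from the article; `safeBind` is a hypothetical name. It uses the native `Function.prototype.bind` when it exists and otherwise falls back to a closure around `apply`.

```javascript
// Hypothetical helper: binds `fn` to `thisArg` (plus optional leading arguments)
// even in environments where Function.prototype.bind is missing.
function safeBind(fn, thisArg) {
  var boundArgs = Array.prototype.slice.call(arguments, 2);

  if (typeof Function.prototype.bind === "function") {
    // Native path: equivalent to fn.bind(thisArg, ...boundArgs)
    return Function.prototype.bind.apply(fn, [thisArg].concat(boundArgs));
  }

  // Fallback path: emulate bind with a closure
  return function () {
    var callArgs = Array.prototype.slice.call(arguments);
    return fn.apply(thisArg, boundArgs.concat(callArgs));
  };
}

// Usage: wrap console.log so it keeps `console` as its receiver
var log = safeBind(console.log, console, "[app]");
log("started");
```

The fallback does not cover every detail of the native method (for example, use with `new`), which is usually fine for a logging wrapper.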
Validating Application Performance in a cloud environment, using C#,...
The rise of “platform” sites (e.g. Heroku) enables developers to build and deploy web applications cheaply, without understanding operational problems. Typically these products let you purchase a...
Sequelize Update example
There are a couple of ways to update values in the Sequelize ORM: db.Alert.update( { url: url }, { fields: ['url'], where: {id: id} } ); If you have an object, this is also supposed to work: var alert =...