Sourcehunt: Idea of the Month and 6 Interesting Repos!

Bruno Skvorc

It’s time for our monthly hunt for new open source libraries to use and contribute to!

If you’re new to Sourcehunt, it’s our monthly post for promoting open source projects that seem interesting or promising and could use help in terms of GitHub stars or pull requests.

It’s our way of giving back – promoting projects that we use (or could use) so that they gain enough exposure to attract a wider audience, a powerful community and, possibly, new contributors or sponsors.

genkgo/archive-stream [5 ★]

We stumbled upon this package following this very interesting read about streamed file zipping and downloading in PHP.

This package provides a memory-efficient API for streaming ZIP files as PSR-7 messages. It comes with certain limitations: only the Zip64 format (version 4.5 of the Zip specification) is supported, and a download cannot be resumed if it fails before finishing. Apart from that, using it is incredibly straightforward if you’re already wading through the PSR-7 waters:

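// Describe the archive contents; nothing is read or zipped at this point.
$archive = (new Archive())
    // Content produced lazily by a callback
    ->withContent(new CallbackStringContent('callback.txt', function () {
        return 'data';
    }))
    // Content from a string already in memory
    ->withContent(new StringContent('string.txt', 'data'))
    // Content read from a local file
    ->withContent(new FileContent('file.txt', 'local/file/name.txt'))
    ->withContent(new EmptyDirectory('directory'));

// Attach the archive to the PSR-7 response as a stream.
$response = $response->withBody(
    new Psr7Stream(new ZipReader($archive))
);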

The library is over 8 months old and has many contributors who have done a professional job of fool-proofing it, yet it lacks exposure. Oddly enough, there are no issues (the issues feature itself is disabled) and no PRs to review, but the authors could always use more testers, maintainers, and tech writers to present the library in blog posts.

How do you usually send ZIPs, if at all? Why not give this a go?

patrickschur/language-detection [270 ★]

Patrick originally contacted us via our Facebook page asking to be mentioned in this month’s sourcehunt, and given the excellence of this package, we couldn’t resist.

The package uses training texts to build N-gram frequency profiles for each language, then matches a test sample against those profiles to detect its language. As per the README:

This library can detect the language of a given text string. It can parse given training text in many different idioms into a sequence of N-grams and builds a database file in JSON format to be used in the detection phase. Then it can take a given text and detect its language using the database previously generated in the training phase. The library comes with text samples used for training and detecting text in 106 languages.

The library is well documented, well tested, and well built – it’s very straightforward to use, and really simple to add more training data to it.
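
To make the N-gram idea from the README more concrete, here is a rough sketch of one common way to do it: rank each language's trigrams by frequency, then pick the language whose ranking is closest to the sample's. Everything below is an assumption for illustration (the function names `lowercaseText`, `ngramProfile`, `outOfPlaceDistance`, and `detectLanguage`, and the two tiny training sentences); it is not the library's API, and its internals may differ.

```php
<?php
// Illustrative sketch only: these helpers are hypothetical, not the library's API.

/** Lowercase text, using mbstring when it is available. */
function lowercaseText(string $text): string
{
    return function_exists('mb_strtolower') ? mb_strtolower($text, 'UTF-8') : strtolower($text);
}

/** Return the text's N-grams ordered from most to least frequent. */
function ngramProfile(string $text, int $n = 3, int $limit = 300): array
{
    // Collapse every run of non-letters into one space and pad both ends.
    $clean = ' ' . trim((string) preg_replace('/[^\p{L}]+/u', ' ', lowercaseText($text))) . ' ';
    $chars = preg_split('//u', $clean, -1, PREG_SPLIT_NO_EMPTY);

    $counts = [];
    for ($i = 0; $i + $n <= count($chars); $i++) {
        $gram = implode('', array_slice($chars, $i, $n));
        $counts[$gram] = ($counts[$gram] ?? 0) + 1;
    }
    arsort($counts); // most frequent first

    return array_slice(array_keys($counts), 0, $limit);
}

/** Sum of rank differences; N-grams missing from the reference get the maximum penalty. */
function outOfPlaceDistance(array $sample, array $reference): int
{
    $rankOf = array_flip($reference);
    $maxPenalty = count($reference);
    $distance = 0;
    foreach ($sample as $rank => $gram) {
        $distance += isset($rankOf[$gram]) ? abs($rank - $rankOf[$gram]) : $maxPenalty;
    }
    return $distance;
}

/** Pick the language whose profile is closest to the sample's profile. */
function detectLanguage(string $text, array $profiles): string
{
    $sample = ngramProfile($text);
    $best = '';
    $bestDistance = PHP_INT_MAX;
    foreach ($profiles as $language => $profile) {
        $distance = outOfPlaceDistance($sample, $profile);
        if ($distance < $bestDistance) {
            $best = $language;
            $bestDistance = $distance;
        }
    }
    return $best;
}

// Toy "training" data; a real profile would be built from far more text.
$profiles = [
    'en' => ngramProfile('the quick brown fox jumps over the lazy dog and then the dog sleeps in the sun'),
    'de' => ngramProfile('der schnelle braune fuchs springt über den faulen hund und dann schläft der hund in der sonne'),
];

echo detectLanguage('the dog and the fox', $profiles), PHP_EOL;
```

With training text this small the result is only reliable for samples that share words with it, which is exactly why the amount and quality of training data matters so much.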

The package could use some in-depth tutorials and maintainers, so if you’re up for a low-maintenance gig, sign up!

PhpInsight [195 ★]

Used to extract sentiment from text. This can be positive, neutral, or negative. It’s like a very basic, free, open source version of Semantria.
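
If you've never seen sentiment classification before, the simplest possible version is just counting words from a positive list and a negative list. The sketch below does only that; the word lists and the `classifySentiment` function are made up for illustration, and this is not how PhpInsight itself is implemented or called.

```php
<?php
// Illustrative sketch only: a hypothetical word-list scorer, not PhpInsight's API.

/** Classify text as 'positive', 'negative', or 'neutral' by counting listed words. */
function classifySentiment(string $text): string
{
    // Tiny hypothetical dictionaries; a real one would hold thousands of entries.
    $positiveWords = ['good', 'great', 'love', 'excellent', 'happy'];
    $negativeWords = ['bad', 'awful', 'hate', 'terrible', 'sad'];

    // Split on anything that is not a lowercase ASCII letter.
    $words = preg_split('/[^a-z]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    $score = 0;
    foreach ($words as $word) {
        if (in_array($word, $positiveWords, true)) {
            $score++;
        } elseif (in_array($word, $negativeWords, true)) {
            $score--;
        }
    }

    if ($score > 0) {
        return 'positive';
    }
    return $score < 0 ? 'negative' : 'neutral';
}

echo classifySentiment('I love this great library'), PHP_EOL;
```

A scorer this naive misses negation ("not good") and sarcasm entirely, which hints at why tuning for precision is never really finished.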

The library has been out for a few years, and hasn’t been updated in a couple, but there’s a reason why we’re sharing it: a) it’s really interesting and fun to play with, and b) see below.

In the meantime, the library could use some help with several issues, and there’s always tweaking for precision to be done. Help out if you can!

isfonzar/sentiment-thermometer [42 ★]

Using the PhpInsight library above, this one goes to Twitter, searches for the provided phrase or term, and measures Twitter’s sentiment towards it. This means it’ll grab the tweets that talk about this, analyze them for sentiment, and return an overall result.

Now, you might argue that the demo’s result for a Donald Trump Twitter search, 40% positive and only 20% negative sentiment, doesn’t make much sense. The author did say the sentiment varies according to time of day and whatnot (maybe he tested when it was mid-day in Russia?), but the library could obviously use some tweaking and fine-tuning, so go ahead and jump right in. It could also use the ability to look up other social networks – Facebook’s public posts might be an interesting one, for example.
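
The "overall result" step is easy to picture: once every tweet has a label, you count the labels and turn them into percentages. Here's a small sketch of that aggregation; the `sentimentBreakdown` function and the sample labels are hypothetical and not taken from the package.

```php
<?php
// Illustrative sketch only: hypothetical aggregation, not the package's code.

/** Turn a list of per-post labels into a percentage per sentiment. */
function sentimentBreakdown(array $labels): array
{
    $totals = ['positive' => 0, 'neutral' => 0, 'negative' => 0];
    foreach ($labels as $label) {
        if (isset($totals[$label])) {
            $totals[$label]++; // unknown labels are ignored
        }
    }

    $count = array_sum($totals);
    $percentages = [];
    foreach ($totals as $label => $total) {
        $percentages[$label] = $count > 0 ? round(100 * $total / $count, 1) : 0.0;
    }
    return $percentages;
}

// Five made-up labels: 2 positive, 2 neutral, 1 negative.
print_r(sentimentBreakdown(['positive', 'positive', 'neutral', 'neutral', 'negative']));
```

With only a handful of posts, a single tweet moves the result by many percentage points, so a larger sample is the first thing to try when the numbers look odd.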

paragonie/PHP-Cookie [7 ★]

There’s a new CSRF killer in town, and paragonIE’s PHP-Cookie package is the first to truly utilize it.

In case you missed it, cookies now support the SameSite directive in newer browsers, effectively killing CSRF in all modern clients. Old browsers are still vulnerable, so you definitely shouldn’t depend on SameSite alone just yet, but it’s a step in the right direction and all we have to do is wait for adoption.

PHP-Cookie takes this a step further and adds the functionality in as a PHP package – you can now easily set SameSite cookies from PHP. In this lib:

  • Secure is set to TRUE unless you override it.
  • HTTP-Only is set to TRUE unless you override it.
  • Same-Site is set to Strict unless you override it.

Note that PHP 7+ is a requirement.
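
To see what those defaults mean on the wire, here is a minimal sketch that builds the `Set-Cookie` header by hand with the same secure-by-default choices. The `buildCookieHeader` function and its option names are assumptions for illustration; this is not PHP-Cookie's API.

```php
<?php
// Illustrative sketch only: a hypothetical helper, not PHP-Cookie's API.

/** Build a Set-Cookie header line with Secure, HttpOnly and SameSite=Strict by default. */
function buildCookieHeader(string $name, string $value, array $options = []): string
{
    // Array union keeps any caller-supplied keys and fills in the rest.
    $options += ['path' => '/', 'secure' => true, 'httponly' => true, 'samesite' => 'Strict'];

    if ($options['samesite'] !== null && !in_array($options['samesite'], ['Strict', 'Lax', 'None'], true)) {
        throw new InvalidArgumentException('samesite must be Strict, Lax, None, or null');
    }

    $header = 'Set-Cookie: ' . rawurlencode($name) . '=' . rawurlencode($value);
    $header .= '; Path=' . $options['path'];
    if ($options['secure']) {
        $header .= '; Secure';
    }
    if ($options['httponly']) {
        $header .= '; HttpOnly';
    }
    if ($options['samesite'] !== null) {
        $header .= '; SameSite=' . $options['samesite'];
    }
    return $header;
}

echo buildCookieHeader('session', 'abc123'), PHP_EOL;
// prints: Set-Cookie: session=abc123; Path=/; Secure; HttpOnly; SameSite=Strict
```

In a real application the string would be passed to `header()` before any output is sent. PHP 7.3 and later also accept a `samesite` key in the options array of the native `setcookie()`.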

If you’re interested, there’s also a suggestion to implement this natively in PHP.

kamranahmedse/design-patterns-for-humans [7718 ★]

Last but not least, a non-code repo. This awesome compilation of human-readable explanations of design patterns blew up on Reddit and Hacker News over the weekend, and has been growing ever since.

The author is a non-native English speaker, so there are still quite a few typos to fix and phrases to correct, and additional examples of design patterns would also be handy. But in all honesty, this repo has enough inertia to survive without further promotion – we just wanted to bring it to your attention, in case you missed it.

That’s it for February. Found anything you could sink your teeth into?

As always, please throw your links at us with the #sourcehunt hashtag! If you build something with the projects we’ve mentioned, or if you submit an elaborate pull request you’d like to talk about, give us a shout and we’ll make sure the world knows about it!

App+Tutorial idea of the month: Add Facebook public posts into the sentiment thermometer, add language detection into the mix, use Google Translate to translate non-English posts, and then grab sentiment of all those. Build a graph of sentiment for a term every hour in a day for several days/weeks in a row, and find trends of liking and disliking a phrase. Add this mix to a dashboard where all a user has to do is define a term, and a widget is generated with trend graphs. Bonus points if terms and their values are saved for later, so that if another user requests, for example, Donald Trump, the data is already there and reused, saving bandwidth and API calls.
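
As a starting point for the trend graph part of that idea, here's a rough sketch of bucketing scored posts by hour and averaging each bucket. The post structure (`timestamp` as a Unix time, `score` as +1, 0, or -1) and the `hourlySentimentTrend` function are assumptions made up for this example.

```php
<?php
// Illustrative sketch only: the post format and function name are hypothetical.

/** Group scored posts into UTC hour buckets and return the average score per hour. */
function hourlySentimentTrend(array $posts): array
{
    $buckets = [];
    foreach ($posts as $post) {
        // Truncate the timestamp to the start of its hour, e.g. "2017-02-01 09:00".
        $hour = gmdate('Y-m-d H:00', $post['timestamp']);
        $buckets[$hour][] = $post['score'];
    }
    ksort($buckets); // chronological order, since the keys sort lexicographically

    $trend = [];
    foreach ($buckets as $hour => $scores) {
        $trend[$hour] = array_sum($scores) / count($scores);
    }
    return $trend;
}

// Made-up posts: two in the 09:00 hour, two in the 10:00 hour.
$posts = [
    ['timestamp' => gmmktime(10, 30, 0, 2, 1, 2017), 'score' => 1],
    ['timestamp' => gmmktime(9, 5, 0, 2, 1, 2017), 'score' => 1],
    ['timestamp' => gmmktime(9, 45, 0, 2, 1, 2017), 'score' => -1],
    ['timestamp' => gmmktime(10, 10, 0, 2, 1, 2017), 'score' => 1],
];

print_r(hourlySentimentTrend($posts));
```

The resulting hour-to-average map is the kind of series a charting widget could plot directly, and storing it per term is what would let a second user reuse the data.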

Happy coding!