Huginn: An Open-Source, Self-Hosted IFTTT
As developers, we don’t have the time or patience for routine tasks. We like to get things done, and any tools that can help us automate are high on our radar.
Enter Huginn, a workflow-automation server similar to Zapier or IFTTT — but open source. With Huginn, you can automate tasks, such as watching for air travel deals, continually watching for certain topics on Twitter, or scanning for sensitive data in your code.
Recently, a post about Huginn hit the top of Hacker News. This piqued my interest. I wanted to see why it’s so popular, what it’s all about, and what it’s being used for.
How Huginn Started
I reached out to Huginn’s creator, Andrew Cantino, to ask him why he started it.
“I started the project in 2013 to scratch my own itch — I wanted to scrape some websites to know when they changed (web comics, movie trailers, local weather forecasts, Craigslist sales, eBay, etc.), and I wanted to be able to automate simple reactions to those changes. I’d been interested in personal automation for a while, and Huginn was initially a quick project I built over the Christmas holidays that year.”
However, that simple Christmas-holiday project quickly grew.
Today, Huginn is a community-driven project with hundreds of contributors and thousands of users. Andrew still uses Huginn for its original use case:
“I still primarily use Huginn for this purpose: It tells me about upcoming yard sales, if I should bring an umbrella today because of rain in the forecast, when rarely updated blogs have changed, when certain words spike on Twitter, etc. I also have found it very useful for sourcing information for the weekly newsletter that I write about the space industry, called ‘The Orbital Index.’”
However, the community has found a wider range of uses. So let’s look at exactly what Huginn is, how to set it up, and how to use it to automate your everyday life.
How Huginn Works
Huginn is a web-based scheduling service that runs workers called Agents. Each Agent performs a specific function, such as sending an email or requesting a web page. Agents generate and consume JSON payloads called events, and events are what chain Agents together: one Agent’s output becomes another Agent’s input. Agents can run on a schedule or be executed manually.
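To make the event-chaining idea concrete, here is a minimal Python sketch. It is not Huginn code (Huginn itself is a Ruby on Rails application); the function names source_agent, formatting_agent, and run_chain are hypothetical and exist only to show JSON-style events flowing from one worker to the next.

```python
import json
from typing import Callable, Dict, List

Event = Dict[str, object]  # an event is a JSON-serializable payload


def source_agent() -> List[Event]:
    """Hypothetical producer: emits one event, as a scraping Agent might."""
    return [{"title": "Example comic", "url": "https://example.com/comic.png"}]


def formatting_agent(event: Event) -> List[Event]:
    """Hypothetical consumer: receives an event and emits a reshaped one."""
    return [{"message": f"New comic: {event['title']} ({event['url']})"}]


def run_chain(source: Callable[[], List[Event]],
              receivers: List[Callable[[Event], List[Event]]]) -> List[Event]:
    """Pass every event from the source through each receiver in order."""
    events = source()
    for receiver in receivers:
        next_events: List[Event] = []
        for event in events:
            # Round-trip through JSON to show that payloads are plain JSON.
            payload = json.loads(json.dumps(event))
            next_events.extend(receiver(payload))
        events = next_events
    return events


result = run_chain(source_agent, [formatting_agent])
print(result)
```

Each receiver only sees the payload it is handed, which is why Agents can be rearranged or swapped without changing the ones around them.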
Getting Started
It’s easy to deploy Huginn with just one click using the Deploy to Heroku button. Huginn also supports Docker and Docker Compose, manual installation on Linux, and many other deployment methods. After installing, you can extend Huginn by using one of the many available Agent Gems or by creating your own.
Here’s an existing Agent that pulls the latest comic from xkcd.com. You can see the basic stats of the Agent (last checked, last created, and so on). The Options field shows how the Agent is configured, including the CSS selectors used to extract data from the page.
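As a rough illustration of what such an Options field might contain, the Python sketch below builds a configuration as a dictionary and serializes it to JSON. The option names and CSS selectors are assumptions modeled on a typical Website Agent setup, not a copy of the Agent described above; check them against your own Huginn instance.

```python
import json

# Assumed option names and selectors, for illustration only.
xkcd_agent_options = {
    "url": "https://xkcd.com",
    "type": "html",
    "mode": "on_change",  # assumed: only emit an event when the page changes
    "expected_update_period_in_days": 5,
    "extract": {
        # Each extracted field pairs a CSS selector with the attribute to read.
        "url": {"css": "#comic img", "value": "@src"},
        "title": {"css": "#comic img", "value": "@alt"},
        "hovertext": {"css": "#comic img", "value": "@title"},
    },
}

# Agent options are stored as JSON, so serialize the dictionary.
options_json = json.dumps(xkcd_agent_options, indent=2)
print(options_json)
```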
Scenarios
You can also organize Agents into Scenarios, which let you group similar Agents and import or export Agent configurations as JSON files. You can fine-tune Agent scheduling and configuration using special Agents called Controllers. Here we see a Scenario built around the theme of Entertainment.
Dynamic Content
Lastly, Huginn uses the Liquid templating engine, which allows you to load dynamic content into Agents. This is commonly used to store configuration data (such as credentials) separately from Agents.
Here, it’s used to format the URL, title, and on-hover text from the XKCD Source Agent as HTML:
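The sketch below imitates that kind of formatting in Python. It is a toy placeholder renderer, not the Liquid engine: it only handles simple {{ name }} placeholders, and unlike Liquid it HTML-escapes every value. The template and the event field names (url, title, hovertext) are assumptions chosen to match the description above.

```python
import html
import re


def render(template: str, event: dict) -> str:
    """Replace each {{ name }} placeholder with the HTML-escaped event value."""
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        # Missing keys render as an empty string.
        return html.escape(str(event.get(key, "")), quote=True)

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# Hypothetical template and event, for illustration only.
template = '<img src="{{ url }}" alt="{{ title }}" title="{{ hovertext }}">'
event = {
    "url": "https://example.com/comic.png",
    "title": "Example",
    "hovertext": 'Say "hi"',
}

rendered = render(template, event)
print(rendered)
```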
Why Would I Use Huginn?
In addition to web scraping, Huginn supports a wide variety of actions that can allow for some truly complex workflows. Disclaimer: Many sites disallow automated web scraping. Be sure to check the terms of service (TOS) of any website you intend to access using Huginn.
Some of the examples from the GitHub page include:
• Watch for air travel or shopping deals
• Follow your project names on Twitter and get updates when people mention them
• Connect to Adioso, HipChat, Basecamp, Growl, FTP, IMAP, Jabber, JIRA, MQTT, nextbus, Pushbullet, Pushover, RSS, Bash, Slack, StubHub, translation APIs, Twilio, Twitter, Wunderground, and Weibo, to name a few.
• Send digest emails with things that you care about at specific times during the day.
• Track counts of high-frequency events and send an SMS within moments when they spike.
• Send and receive WebHooks.
• Run custom JavaScript or CoffeeScript functions.
• Track your location over time.
• Create Amazon Mechanical Turk workflows as the inputs, or outputs, of agents (the Amazon Turk Agent is called the “HumanTaskAgent”). For example: “Once a day, ask 5 people for a funny cat photo; send the results to 5 more people to be rated; send the top-rated photo to 5 people for a funny caption; send to 5 final people to rate for funniest caption; finally, post the best-captioned photo on my blog.”
Monitoring Social Networks
Huginn supports several social networks, including Twitter and Tumblr. These Agents can watch for new posts, trending topics, and updates from other users.
Let’s say you live in a hurricane-prone area and want to follow the impact of a storm. Using a Twitter Stream Agent, you can watch for Tweets containing “hurricane,” “storm,” and so on, and pass the results to a Peak Detector Agent. This counts Tweets over a period of time, measures the standard deviation, and fires an event if it detects an outlier. You can have this event trigger an Email Agent that notifies you immediately. Andrew Cantino explains this use case in more detail on his blog.
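The outlier check described above can be sketched in a few lines of Python. This is an illustration of the general technique (compare the newest count against the mean and standard deviation of earlier counts), not Huginn’s actual Peak Detector implementation. The function name is_peak and the default threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev
from typing import List


def is_peak(history: List[int], current: int, threshold: float = 3.0) -> bool:
    """Return True if `current` sits more than `threshold` standard
    deviations above the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to estimate a deviation
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return current > mu  # flat history: any increase counts as a peak
    return (current - mu) / sigma > threshold


# Hypothetical Tweet counts per interval.
counts = [12, 15, 11, 14, 13, 12]
print(is_peak(counts, 14))
print(is_peak(counts, 90))
```

A lower threshold fires more often and produces more false alarms; a higher one waits for a more dramatic spike.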
Price Shopping
Huginn makes an excellent online shopping tool. When shopping for the best deal, create Website Agents to run daily searches on discount and trading sites. Use Event Formatting Agents to extract prices, then use a Change Detector Agent to compare the last retrieved price to the current price. If it’s lower, you can extract the item URL and send it straight to your inbox.
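The extract-then-compare logic can be sketched in Python as follows. This only illustrates the workflow; inside Huginn the same steps would be handled by Agent configuration rather than code. The helper names parse_price and price_drop and the sample URL are hypothetical.

```python
import re
from typing import Dict, Optional


def parse_price(text: str) -> Optional[float]:
    """Extract the first number from a price string such as '$1,299.99'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return float(match.group(0).replace(",", ""))


def price_drop(last_prices: Dict[str, float], item_url: str,
               current_price: float) -> Optional[dict]:
    """Record the current price and return an event if it dropped."""
    previous = last_prices.get(item_url)
    last_prices[item_url] = current_price
    if previous is not None and current_price < previous:
        return {"url": item_url, "old_price": previous, "new_price": current_price}
    return None


seen: Dict[str, float] = {}
item = "https://example.com/item/1"  # hypothetical listing URL
first = price_drop(seen, item, parse_price("$1,299.99"))
second = price_drop(seen, item, parse_price("$1,199.00"))
print(first, second)
```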
Security Alerts
Staying on top of security updates is a continuous process. You can use Huginn to watch the National Vulnerability Database for CVEs affecting your systems and notify you immediately. If you want to filter the results (e.g., only show high-priority alerts), you can use a Trigger Agent to only allow results where the severity is above a certain value.
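As an illustration of that filtering step, here is a small Python sketch that keeps only events at or above a severity threshold. The event fields (cve_id, severity) and the sample records are made up for the example; they are not the National Vulnerability Database schema or real CVE data.

```python
from typing import Dict, List


def filter_by_severity(events: List[Dict[str, object]],
                       minimum: float = 7.0) -> List[Dict[str, object]]:
    """Keep only events whose 'severity' score is at or above `minimum`."""
    return [e for e in events if float(e.get("severity", 0)) >= minimum]


# Fabricated sample events with placeholder identifiers.
alerts = [
    {"cve_id": "CVE-0000-0001", "severity": 9.8},
    {"cve_id": "CVE-0000-0002", "severity": 4.3},
    {"cve_id": "CVE-0000-0003", "severity": 7.0},
]

high_priority = filter_by_severity(alerts)
print([a["cve_id"] for a in high_priority])
```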
Advanced Use Cases
Huginn comes with some powerful Agents that greatly extend its capabilities beyond web scraping.
Data Processing and Validation
Huginn can read files stored on the host, making it a useful data-processing tool. Let’s say you’re testing changes to a codebase, and before you commit, you want to scan for any sensitive data you might have left in during testing. You can create a Local File Agent to scan your project directory, pass the contents to an Event Formatting Agent, and use regular expressions to detect credentials, passwords, and similar strings.
Alternatively, you could use a Shell Command Agent to call a utility like repo-supervisor and fire a desktop notification when it detects matches.
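The regular-expression scan mentioned above might look like this in Python. The patterns are illustrative and far from exhaustive, and a dedicated secret scanner will catch much more. The pattern names and the scan_text helper are assumptions for this sketch.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; real scanners use many more rules.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*['\"][^'\"]+['\"]"),
    "private_key_header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings


# Fabricated sample content; the key below is not a real credential.
sample = 'user = "alice"\npassword = "hunter2"\nkey = "AKIAABCDEFGHIJKLMNOP"\n'
print(scan_text(sample))
```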
Newsroom Automation
One of Huginn’s first great successes was its adoption by the New York Times to automate newsroom tasks.
During the 2014 Winter Olympics, Huginn monitored their data-pipeline availability and sent notifications when medals were awarded.
Huginn also notified reporters when new stories were published and updated a Slack channel when content changed on nytimes.com. You can learn more about their use cases at Huginn for Newsrooms.
Conclusion
Huginn is a deceptively simple tool with a lot of flexibility. The best way to see what it can do is to try it yourself.
Published at DZone with permission of Michael Bogan. See the original article here.