These are the news items I've curated in my monitoring of the API space that have some relevance to the API definition conversation and I wanted to include in my research. I'm using all of these links to better understand how the space is testing their APIs, going beyond just monitoring and understand the details of each request and response.22 Oct 2018
I spend a lot of time helping enterprise organizations discover their APIs. All of the organizations I talk to have trouble knowing where all of their APIs are–even the most organized of them. Development and IT groups have just been moving too fast over the last decade to know where all of their web services, and APIs are. Resulting in large organizations not fully understanding what all of their capabilities are, even if it is something they actively operate, and may drive existing web or mobile applications.
Each individual API within the enterprise represents a single capability. The ability to accomplish a specific enterprise tasks that is valuable to the business. While each individual engineer might be aware of the capabilities present on their team, without group wide, and comprehensive API discovery across an organization, the extent of the enterprise capabilities is rarely known. If architects, business leadership, and any other stakeholder can’t browse, list, search, and quickly get access to all of the APIs that exist, the knowledge of the enterprise capabilities will not be able to be quantified or articulated as part of regular business operations.
In 2018, the capabilities of any individual API is articulated by it’s machine readable definition. Most likely OpenAPI, but could also be something like API Blueprint, RAML, or other specification. For these definitions to speak to not just the technical capabilities of each individual API, but also the business capabilities, they will have to be complete. Utilizing a higher level strategic set of tags that help label and organize each API into a meaningful set of business capabilities that best describes what each API delivers. Providing a sort of business capabilities taxonomy that can be applied to each API’s definition and used across the rest of the API lifecycle, but most importantly as part of API discovery, and the enterprise digital product catalog.
One of the first things I ask any enterprise organization I’m working with upon arriving, is “do you know where all of your APIs are?” The answer is always no. Many will have a web services or API catalog, but it almost always is out of date, and not used religiously across all groups. Even when there are OpenAPI definitions present in a catalog, they rarely contain the meta data needed to truly understand the capabilities of each API. Leaving developer and IT operations existing as black holes when it comes to enterprise capabilities, sucking up resources, but letting very little light out when it comes to what is happening on the inside. Making it very difficult for developers, architects, and business users to articulate what their enterprise capabilities are, and often times reinventing the wheel when it comes to what the enterprise delivers on the ground each day.
Everyone wants their OpenAPIs to be complete, but what that really means will depend on who you are, what your knowledge of OpenAPI is, as well as being driven by your motivation for having an OpenAPI in the first place. I wanted to take a crack at articulating a complete(enough) definition for OpenAPIs I create, based upon what I’m needing them to do.
Info & Base - Give the basic information I need to understand who is behind, and where I can access the API.
Paths - Provide an entry for every path that is available for an API, and should be included in this definition.
Parameters - Provide a complete list of all path, query, and header parameters that can be used as part of an API. https://gist.github.com/kinlane/29d0247d6ff4aaa39db4dc793df4a2f9
Descriptions - Flesh out descriptions for all the path and parameter descriptions, helping describe an API does.
Enums - Publish a list of all the enumerated values that are possible for each parameter used as part of an API. https://gist.github.com/kinlane/444731f0214cab5efcc3ae77011823ba
Definitions - Document the underlying schema being returned by creating a JSON schema definition for the API.
Responses - Associate the definition for the API with the path using a response reference, connecting the dots regarding what will be returned.
Tags - Tag each path with a meaningful set of tags, describing what resources are available in short, concise terms and phrases.
Contacts - Provide contact information for whoever can answer questions about an API, and provide a URL to any support resources.
Create Security Definitions - Define the security for accessing the API, providing details on how each API request will be authenticated.
Apply Security Definitions - Apply the security definition to each individual path, associating common security definitions across all paths.
Complete(enough) - That should give us a complete (enough) API description.
Obviously there is more we can do to make an OpenAPI even more complete and precise as a business contract, hopefully speaking to both developers and business people. Having OpenAPI definitions are important, and having them be up to date, complete (enough), and useful is even more important. OpenAPIs provide much more than documentation for an API. They provide all the technical details an API consumer will need to successfully work with an API.
While there are obvious payoffs for having an OpenAPI, like being able to publish documentation, and generate code libraries. There are many other uses for an OpenAPI like loading into Postman, Stoplight, and many other API services and tooling that helps developers understand what an API does, and reduce friction when they integrate, and have to maintain their applications. Having an OpenAPI available is becoming a default mode of operation, and something every API provider should have.
I reach out to API providers on a regular basis, asking them if they have an OpenAPI or Postman Collection available behind the scenes. I am adding these machine readable API definitions to my index of APIs that I monitor, while also publishing them out to my API Stack research, the API Gallery, APIs.io, work to get them published in the Postman Network, and syndicated as part of my wider work as an OpenAPI member. However, even beyond my own personal needs for API providers to have a machine readable definition of their API, and helping them get more syndication and exposure for their API, having an definition present significantly reduces friction when on-boarding with their APIs at almost every stop along a developer’s API integration journey.
One of the API providers I reached out to recently responded with this, “I spoke with one of our engineers and he asked me to refer you to https://developer.[company].com/”. Ok. First, I spend over 30 minutes there just the other day. Learning about what you do, reading through documentation, and thinking about what was possible–which I referenced in my email. At this point I’m guessing that the engineer in question doesn’t know what an OpenAPI or Postman Collection is, they do not understand the impact these specifications are having on the wider API ecosystem, and lastly, I’m guessing they don’t have any idea who I am(ego taking control). All of which provides me with the signals I need to make an assessment of where any API is in their overall journey. Demonstrating to me that they have a long ways to go when it comes to understanding the wider API landscape in which they are operating in, and they are too busy to really come out of their engineering box and help their API consumers truly be successful in integrating with their platform.
I see this a lot. It isn’t that I expect everyone to understand what OpenAPI and Postman Collections are, or even know who I am. However, I do expect people doing APIs to come out of their boxes a little bit, and be willing to maybe Google a topic before responding to question, or maybe Google the name of the person they are responding to. I don’t use a gmail.com address to communicate, I am using apievangelist.com, and if you are using a solution like Clearbit, or other business intelligence solution, you should always be retrieving some basic details about who you are communicating with, before you ever respond. That is, you do all of this kind of stuff if you are truly serious about operating your API, helping your API consumers be more successful, and taking the time to provide them with the resources they need along the way–things like an OpenAPI, or Postman Collections.
Ok, so why was this response so inadequate?
- No API Team Present - It shows me that your company doesn’t have any humans their to support the humans that will be using your API. My email went from general support, to a backend engineer who doesn’t care about who I am, and about what I need. This is a sign of what the future will hold if I actually bake their API into my applications–I don’t need my questions lost between support and engineering, with no dedicated API team to talk to.
- No Business Intelligence - It shows me that your company has put zero thought into the API business model, on-boarding, and support process. Which means you do not have a feedback loop established for your platform, and your API will always be deficient of the nutrients it needs to grow. Always make sure you conduct a lookup based upon on the domain, or Twitter handle or your consumers to get the context you need to understand who you are talking to.
- Stuck In Your Bubble - You aren’t aware of the wider API community, and the impact OpenAPI, and Postman are having on the on-boarding, documentation, and other stops along the API lifecycle. Which means you probably aren’t going to keep your platform evolving with where things are headed.
Ok, so why should you have an OpenAPI and Postman Collection?
- Reduce Onboarding Friction - As a developer I won’t always have the time to spend absorbing your documentation. Let me import your OpenAPI or Postman Collection into my client tooling of choice, register for a key and begin making API calls in seconds, or minutes. Make learning about your API a hands on experience, something I’m not going to get from your static documentation.
- Interactive API Documentation - Having a machine readable definition for your API allows you to easily keep your documentation up to date, and make it a more interactive experience. Rather than just reading your API documentation, I should be able to make calls, see responses, errors, and other elements I will need to truly understand what you do. There are plenty of open source interactive API documentation solutions that are driven by OpenAPI and Postman, but you’d know this if you were aware of the wider landscape.
- Generate SDKs, and Other Code - Please do not make me hand code the integration with each of your API endpoints, crafting each request and response manually. Allow me to autogenerate the most mundane aspects of integration, allowing OpenAPI and Postman Collection to act as the integration contract.
- Discovery - Please don’t expect your potential consumers to always know about your company, and regularly return to your developer.[company].com portal. Please make your APIs portable so that they can be published in any directory, catalog, gallery, marketplace, and platform that I’m already using, and frequent as part of my daily activities. If you are in my Postman Client, I’m more likely to remember that you exist in my busy world.
These are just a few of the basics of why this type of response to my question was inadequate, and why you’d want to have OpenAPI and Postman Collections available. My experience on-boarding will be similar to that of other developers, it just happens that the application I’m developing are out of the normal range of web and mobile applications you have probably been thinking about when publishing your API. But this is why we do APIs, to reach the long tail users, and encourage innovate around our platforms. I just stepped up and gave 30 minutes of my time (now 60 minutes with this story) to learning about your platform, and pointing me to your developer.[company].com page was all you could muster in return?
Just like other developers will, if I can’t onboard with your API without friction, and I can’t tell if there is anyone home, and willing to give me the time of day when I have questions, I’m going to move on. There are other platforms that will accommodate me. The other downside of your response, and me moving on to another platform, is that now I’m not going to write about your API on my blog. Oh well? After eight years of blogging on APIs, and getting 5-10K page views per day, I can write about a topic or industry, and usually dominate the SEO landscape for that API search term(s) (ego still has control). But…I am moving on, no story to be told here. The best part of my job is there are always stories to be told somewhere else, and I get to just move on, and avoid the friction wherever possible when learning how to put APIs to work.
I just needed this single link to provide in response to my email response, before I moved on!
This is a topic I’ve wanted to set in motion for some time now. I had a new university professor city my work again as part of one of their courses recently, something that floated this concept to the top of the pile again–API discovery collections meant for just for students. Helping k-12, community college, and university students quickly understand where to find the most relevant APIs to whatever they are working on. Providing human, but also machine readable collections that can help jumpstart their API education.
I use the API discovery format APIs.json to profile individual, as well as collections of APIs. I’m going to kickstart a couple of project repos, helping me flesh out a handful of interesting collections that might help students better understand the world of APIs:
- Social - The popular social APIs like Twitter, Facebook, Instagram, and others.
- Messaging - The main messaging APIs like Slack, Facebook, Twitter, Telegram, and others.
- Rock Star - The cool APIs like Twitter, Stripe, Twilio, YouTube, and others.
- Amazon Stack - The core AWS Stack like EC2, S3, RDS, DynamoDB, Lambda, and others.
- Backend Stack - The essential App stack like AWS S3, Twilio, Flickr, YouTube, and others.
I am going to start there. I am trying to provide some simple, usable collections or relevant APIs for students are just getting started If there are any other categories, or stacks of APIs you think would be relevant for students to learn from I’d love to hear your thoughts. I’ve done a lot of writing about educational and university based APIs, but I’ve only lightly touched upon what APIs should students be learning about in the classroom.
Providing ready to go API collections will be an important aspect of the implementation of any API training and curriculum effort. Having the technical details of the API readily available, as well as the less technical aspects like signing up, pricing, terms of service, privacy policies, and other relevant building blocks should also be front and center. I’ll get to work on these five API discovery collections for students. Get the title, description, and list of each API stack published as a README, then I’ll get to work on publishing the machine, and human readable details for the technology, business, and politics of using APIs.
I’m evolving the search for the Streamdata.io API Gallery I’ve been working on lately. I’m looking to move the basic keywords search that searches the API name and description, as well as the API path, summary, and description using a key word or phrase, to also be about searching parameters in a meaningful way. Each of the APIs in the Streamdata.io API have an OpenAPI definition. It is how I render each of the individual API paths using Jekyll and Github Pages. These parameters give me another dimension of data in which I can index, and use as a facet in my API gallery search.
I am developing different sets of vocabulary to help me search against the parameters used across APIs, with one of them being focused on company related information. I’m trying to find APIs that provide the ability to add, update, and search against company related data, content, and execute algorithms that help make sense of company resources. There is no perfect way to search for API parameters that touch on company resources, but right now I’m looking for a handful of fields: company, organization, business, enterprise, agency, ticker, corporate, and employer. Returning APIs that have a parameter with any of those words in the path or summary, and weighting differently if it is in the description or tags for each API path.
Next, I’m also tagging each API path that has a URL field, because this will allow me to connect the dot to a company, organization, or other entity via the domain. This is all I’m trying to do, is connect the dots using the parameter structure of an API. I find that there is an important story being told at the API design layer, and API search and discovery is how we are going to bring this story out. Connecting the dots at the corporate level is just one of many interesting stories out there, just waiting to be told. Pushing forward the conversation around how we understand the corporate digital landscape, and what resources they have available.
You can do a basic API search at the bottom of the Streamdata.io API Gallery main page. I do not have my parameter search available publicly yet. I want to spend more time refining my vocabularies, and also look at searching the request and response bodies for each path–I’m guessing this won’t be as straightforward, as parameters has been. Right now I’m immersed in understanding the words we use to design our APIs, and craft our API documentation. It is fascinating to see how people describe their resources, and how they think (or don’t think) about making these resources available to other people. OpenAPI definitions provide a fascinating way to look at how APIs are opening up access to company information, establishing the digital vocabulary for how we exchange data and content, and apply algorithms to help us better understand the business world around us.
APIs come in many shapes and sizes. Even when APIs may share a common resource, the likelihood that they are similar in functional, will be slim. Even after eight years of studying APIs, I still struggle with understanding the differences, and putting APIs into common buckets. Think of the differences between two image APIs like Flickr and Instagram, but then also think about the difference between Twitter and Twilio–the differences are many, and a challenge to articulate.
I’m pushing forward my API Stack, and API Gallery work, and I’m needing to better organize APIs into meaningful groups that I can add to the search functionality for each of API discovery services. To help me establish a handful of new buckets, I’m thinking more critically about the different types of API functionality I’m coming across, establishing seven new buckets:
- General Data - You can get at data across the platform, users, and resources.
- Relative Data - You can get at data that is relative to a user, company, or specific account.
- Static Data - The data doesn’t change too often, and will always remain fairly constant.
- Evolving Data - The data changes on a regular basis, providing a reason to come back often.
- Historical Data - Provides access to historical data, going back an X number of. years.
- Service - The API is offered as a service, or is provided to extend a specific service.
- Algorithmic - The API provides some sort of algorithmic functionality like ML, or otherwise.
Understanding the type of data an API provides is important to the work I’m doing. Streamdata.io caters to the needs of financial organizations, and they are looking for data to help them with their investment portfolio, but also have very particular opinions around the type of data they want. This first version of my API type list is heavily weighted towards data, but as I evolve in my thinking, I’m guessing the service and algorithmic buckets will expand and evolve as well.
The APIs I am cataloging within this work spring fit into one or many of these buckets. They are meant to transcend the resource being made available, and the provider behind the service. I want to be able to search, filter, and organize APIs across many of the usual characteristics we use to track on. I’m wanting to go beyond the obvious resource focused characteristics, and move beyond the technology being applied. I’m looking to understand what you can do with an API, and be able to stack hundreds, or thousands of similar APIs side by side, and provide a new view of the landscape.
I was working on a serverless app for Streamdata.io that takes posts to Hacker News and streams them into an Amazon S3 data lake, and I came across the Algolia powered Hacker News search API. After being somewhat frustrated with the simplicity of the official Hacker News API, I was pleased to find the search kindly provided by Algolia.
There is no search API available for the core Hacker News API, and the design leaves a lot to be desired, so the simplicity of Algolia’s API solution was refreshing. There is a lot of data flowing into Hacker News on a regular day, so providing a search API is pretty critical. Additionally, Algolia’s ability to deliver such a simple, usable, yet powerful API on top of a relevant data source like Hacker News demonstrates the utility of what Algolia offers as a search solution–something I wanted to take a moment to point out here on the blog.
I consider search to be an essential ingredient for any API. Every API should have a search element to their stack, allowing the indexing and searching of all API resources through a single path. Making Algolia a relevant API service provider in this area, enabling API providers to outsource the indexing and searching of their resources, and the delivery of a dead simple API for your consumers to tap into. This path forward is probably not for every API, as many weave specialized search throughout their API design, but for teams who are lacking in resources, and can afford to outsource this element–Algolia makes sense.
Seeing Algolia in action, for a specific API I was integrating with helped bring their service front and center for me. I tend to showcase Elastic for deploying API search solutions, but it is a good to receive a regular reminder that Algolia does the same thing as a service. Their work on the Hacker News Search API provides a good example of they can do for you–sure, we can all build our own search solutions, but honestly, do you have the time? I’ll make sure and regularly highlight what Algolia is doing as part of my search API research, and thanks Algolia! I really appreciate what you did for the Hacker News API, it made my work a lot easier.
I told the folks over at SAP that I would take a look at their API Business Hub. It isn’t paid work, just helping provide feedback on another addition to the API discovery front, something I’m pretty committed to helping push forward in any way that I can. They’ve pulled together a pretty clean, OpenAPI driven catalog of useful APIs for the enterprise, so I wanted to make sure I kick the tires and size it up alongside the other API discovery work I am doing.
The SAP API Business Hub is a pretty simple and clean catalog for searching and browsing applications, integrations, as well as APIs–I am going to focus in on the API section. Which at first glance looks to have about 70 separate APIs, but then you notice each of them are just umbrellas for each API platform, and some of them contain many different API endpoints. Some of the APIs are simple language translation and text extraction resources, while others provide robust access to the SAP S/4HANA Cloud, SAP Ariba, and other SAP systems. You see a lot of SAP focused solutions, but then you also see a handful of partner solutions added via their platform partner program.
I see the beginnings of a useful API catalog getting going on over at the SAP API Business Hub. Each API is well documented, and provides an OpenAPI definition for each API, complete with interactive documentation you can play within a sandbox environment. More than most API catalogs, marketplaces, and directories I profile have available. Allowing you to kick the tires and see what is going on, before working with the production version. They also provide you with a Java SDK to download for each API, something that could easily be expanded to support many different platforms, programming languages, and continuous integration cycles with solutions like APIMATIC. Making it more of a discovery, as well as integration marketplace.
Like any API marketplace effort, SAP needs to drum up activity within their catalog. They need more partners signing up to add their APIs, as well as consumers being made aware of the resources published there–something that takes a lot of work, evangelism, and storytelling. Next, I’m going to go through their partner signup and see what I can do to add some of my API resources there, and tell some stories about how they might be able to improve upon the partner flow. I like that their marketplace is OpenAPI driven. I’m curious about how much of the API publishing process is machine readable, allowing API providers to easily add their resources, without a lot of manual form work–something most are going to not have the time and resources for. I’ll keep evaluating how the SAP API Business Hub overlaps with my other API discovery work on the API Stack, the Streamdata.io API Gallery, Postman Network, and partnerships with APIs.guru, APIs.io, and others–continuing to push forward the API discovery conversation after almost 8 years.
About 60% of my work these days is building upon the last five years of my API Stack research, with a focus on building out the Streamdata.io API Gallery. We are fine tuning our approach for discovering new API-driven resources from across the landscape, while also profiling, quantifying, ranking, and publishing to the Streamdata.io API Gallery, The API Stack, and potentially other locations like the Postman Network, APIs.Guru, and other API discovery destinations I am working with. Helping us make sense of the increasingly noisy API landscape, while identifying the most valuable resources, and then profiling them to help reduce friction when it comes to potentially on-boarding and streaming data from each resource.
Discover New API-Driven Resources
Finding new APIs isn’t too difficult, you just have to Google for them. Finding new APIs in an automated way, with minimum human interaction becomes a little more difficult, but there are some proven ways to get the job done. There is no single place to go find new APIs, so I’ve refined a list of common place I use to discover new APIs:
- Search Engines - Using search engine APIs to look for APIs based upon the vocabulary we’ve developed.
- Github - Github provides a wealth of signals when it comes to APIs, and we use the Github API to discover interesting sources using our vocabulary.
- Stack Overflow - Using the Stack Exchange API, we are able to keep an eye out for developers talking about different types of interesting APIs.
- Twitter - The social network still provides some interesting signals when it comes to discussions about interesting APis.
- Reddit - There are many developers who still use Reddit to discuss technical topics, and ask questions about the APIs they are using.
Using the topic and entity vocabulary we’ve been developing, we can automate the discovery of new APIs across these sources using their APIs. Helping track on signals for the existing APIs we are keeping an eye on, but also quickly identify new APIs that we can add to the queue. Giving us the URL of companies, organizations, institutions, and government agencies who are doing interesting things with APIs.
Profile New Domains That Come In
Our API discovery engine produces a wealth of URLs for us to look at to understand the potential for new data, content, and algorithmic API resources. Our profiling process begins with a single URL, which we then use as the seed for a series of automated jobs that help us understand what an entity is all about:
- Description - Develop the most informative and concise description of what an entity does, including a set of rich meta tags.
- Developer - Identify where their developer and API program exists, for quantifying what they do.
- Blog - Find their blog, and supporting RSS feed so we can tune into what they are saying.
- Press - Also find their press section, and RSS feed so we can tune into the press about them.
- Twitter - Find their Twitter account so that we can tune into their social stream.
- LinkedIn - Find their LinkedIn account so that we can tune into their social stream.
- Github - Find their Github account so we can find more about what they are building.
- Contact - Establish a way to contact each entity, in case we have any questions or need support.
- Other - Identify other common building blocks like support, pricing, and terms of services that helps us understand what is going on.
The profiling process provides us with a framework to understand what an entity is all about, and where they fit into the bigger picture of the API landscape. Most of the sources of information we profile have some sort of machine readable component, allowing us to further quantify the entity, and better understand the value they bring to the table.
Quantify Each Entity
Next up we want to quantify each of the entities we’ve profiled, to give us a better understanding of the scope of their operations, and further define where they fit into the API landscape. We are looking for as much detail about what they are up to so we can know where we should be investing our time and energy reaching out and developing deeper relationships.
- API - We profile their APIs, generating an OpenAPI definition that describes the entire surface area of their APIs.
- Applications - Define approximately how many applications are running on an API, and how many developers are actively using.
- Blog - Pull all their blog posts, including the history, and actively pull on daily basis.
- Press - Pull all their press releases, including the history, and actively pull on daily basis.
- Twitter - Pull all their Tweets and mentions, including the history, and actively pull on daily basis.
- Github - Pull all their repos, stars, followers, and commit history, understand more about what they are building.
- Other - Pull other relevant signals from Reddit, Stack Overflow, AngelList, CrunchBase, SEC, Alexa Rank, ClearBit, and other important platform signals.
By pulling all the relevant signals for any entity we’ve profiled, we can better understand the scope of their operations, and assess the reach of their network. Helping us further quantity the value and opportunity that exists with each entity we are profiling, before we spend much more time on integrating.
Ranking Each Entity
After we’ve profiled and quantify an entity, we like to rank them, and put them into different buckets, so that we can prioritize which ones we reach out to, and which ones we invest more resources in monitoring, tracking, and integrating with. We currently rank them on a handful of criteria, using our own vocabulary and ranking formula.
- Provider Signals - Rank their activity and relevance based upon signals within their control.
- Community Signals - Rank their activity based upon signals the community generates about them.
- Analyst Signals - Rank their activity based upon signals from the analyst community.
- StreamRank - Rank the activity of their data, content, and API-driven resources.
- Topically - Understand the value of the activity based upon the topics that are available.
Our ranking of each entity gives us an overall score derived from several different dimensions. Helping us understand the scope, as well as the potential value for each set of APIs, allowing us to further prioritize which entities we invest more time and resources into, maximizing our efforts when it comes to deeper, more technical integrations, and streaming of data into any potential data lake.
Publishing APIs To The Gallery
Once an entity has been profiled, quantified, and ranked, we publish the profile to the gallery for discovery. Some of the more interesting APIs we hold back on a little bit, and share with partners and customers who are looking for interesting data sources via landscape analysis reports, but once we are ready we publish the entity to a handful of potential locations:
- Streamdata.io API Gallery - The distributed gallery owned and operated by Streamdata.io
- The API Stack - My own research area for profiling APIs that I’ve run for five years.
- APIs.guru - We are working on the best way to submit OpenAPI definitions to our friends here.
- Postman Network - For APIs that we validate, and generate working Postman Collections.
- APIs.io - Publishing to the machine readable API search engine for indexing.
- Other - We have a network of other aggregation, discovery, and related sites we are working with.
Because each entity is published to its own Github repository, with an APIs.json, OpenAPI, and Postman Collection defining its operations, once published, each entity becomes forkable. Making each gallery entry something anyone can fork, download and directly integrate into their existing systems and applications.
Keep Discovering, Profiling, Quantifying, and Publishing
This work is never ending. We’ll just keep discovery, profiling, quantifying, and publishing useful APIs to the gallery, and beyond. Since we benchmark APIs, we’ll be monitoring APIs that go away and we’ll archive them in the listings. We’ll also be actively quantifying each entity, by tuning into their blogs, press, Twitter, and Github accounts looking for interesting activity about what they are doing. Keeping our finger on the pulse of what each entity is up to, as well as what the scope and activity within their community is all about.
This project began as an API Evangelist project to understand how to keep up with the changing API space, and then evolved into a landscape analysis and lead generation tool for Streamdata.io, but now has become an engine for identifying valuable data and content resources. Providing a powerful discover engine for finding valuable data sources, but when combined with what Streamdata.io does, it also allows you to tune into the most important signals across all these entities being profiled, and stream the resulting data and signals into data lakes within your own existing cloud infrastructure, for use in training machine learning models, dashboards, and other relevant applications.
While profiling any company, a couple of the Google searches I will execute right away are for “[Company Name] Swagger” and “[Company Name] OpenAPI”, hoping that a provide is progressive enough to have published an OpenAPI definition–saving me hours of work understanding what their API does. I’ve added a third search to my toolbox, if these other two searches do not yield results, searching for “[Company Name] Postman”, revealing whether or not a company has published a Postman Collection for their API–another sign of a progressive, outward thinking API provider in my book.
A machine readable definition for an API tells me more about what a company, organization, institution, or government agency does, than anything else I can dig up on their website, or social media profiles. An OpenAPI definition or Postman Collection is a much more honest view of what an organization does, than the marketing blah blah that is often available on a website. Making machine readable definitions something I look for almost immediately, and prioritize profiling, reviewing, and understanding the entities I come across with a machine readable definition, over those that do not. I only have so much time in a day, and I will prioritize an entity with an OpenAPI or Postman, over those who do not.
The presence of an OpenAPI and / or Postman Collection isn’t just about believing in the tooling benefits these definitions provide. It is about API providers thinking externally about their API consumers. I’ve met a lot of API providers who are dismissive of these machine readable definitions as trends, which demonstrates they aren’t paying attention to the wider API space, and aren’t thinking about how they can make their API consumers lives easier–they are focused on doing what they do. In my experience these API programs tend to not grow as fast, focus on the needs of their integrators and consumers, and often get shut down after they don’t get the results they thought they’d see. APIs are all about having that outward focus, and the presence of OpenAPI and Postman Collection are a sign that a provider is looking outward.
While I’m heavily invested in OpenAPI (I am member), I’m also invested in Postman. More importantly, I’m invested in supporting well defined APIs that provide solutions to developers. When an API has an OpenAPI for delivering mocks, documentation, testing, monitoring, and other solutions, and they provide a Postman Collection that allows you to get up an running making API calls in seconds or minutes, instead of hours or days–it is an API I want to know more about. Making these potential searches the deciding factor between whether or not I will continue profiling and reviewing an API, or just flagging it for future consideration, and moving on to the next API in the queue. I can’t keep up with the number of APIs I have in my queue, and it is signals like this that help me prioritize my world, and get my work done on a regular basis.
I import and work with a number of OpenAPI definitions that I come across in the wild. When I come across a version 1.2, 2.0, 3.0 OpenAPI, I import them into my API monitoring system for publishing as part of my research. After the initial import of any OpenAPI definition, the first thing I look for is the consistent in the naming of paths, the availability of summary, descriptions, as well as tags. The naming conventions used is paths is all over the place, some are cleaner than others. Most have a summary, with fewer having descriptions, but I’d say about 80% of them do not have any tags available for each API path.
Tags for each API path are essential to labeling the value a resource delivers. I’m surprised that API providers don’t see the need for applying these tags. I’m guessing it is because they don’t have to work with many external APIs, and really haven’t put much thought into other people working with their OpenAPI definition beyond it just driving their own documentation. Many people still see OpenAPI as simply a driver of API documentation on their portal, and not as an API discovery, or complete lifecycle solution that is portable beyond their platform. Not considering how tags applied to each API resource will help others index, categorize, and organize APIs based upon the value in delivers.
I have a couple of algorithms that help me parse the path, summary, and description to generate tags for each path, but it is something I’d love for API providers to think more deeply about. It goes beyond just the resources available via each path, and the tags should reflect the overall value an API delivers. If it is a product, event, messaging, or other resource, I can extract a tag from the path, but the path doesn’t always provide a full picture, and I regularly find myself adding more tags to each API(if I have the time). This means that many of the APIs I’m profiling, and adding to my API Stack, API Gallery, and other work isn’t as complete with metadata as they possibly could be. Something API providers should be more aware of, and helping define as part of their hand crafting, or auto-generation of OpenAPI definitions.
Tag your APIs as part of your OpenAPI definitions! I know that many API providers are still auto-generating from a system, but once they have published the latest copy, make sure you load up in one of the leading API design tools, and give that last little bit of polish. Think of it as that last bit of API editorial workflow that ensures your API definitions speak to the widest possible audience, and are as coherent as it possibly can. Your API definitions tell a story about the resources you are making available, and the tags help provide a much more precise way to programmatically interpret what APIs actually deliver. Without them APIs might not properly show up in search engine and Github searches, or render coherently in other API services and tooling. OpenAPI tags are an essential part of defining and organizing your API resources–give them the attention they deserve.
A question I get regularly in my API workshops is, “how should we be organizing all of our microservices?” To which I always recommend they tune into what the API Academy team is up to, and then I dance around give a long winded answer about how hard it is for me to answer that. I think in response, I’m going to start asking for a complete org chart for their operations, list of all their database schema, and a list of all their clients and the industries they are operating in. It will still be a journey for them, or me to answer that question, but maybe this response will help them understand the scope of what they are asking.
I wish I could provide simple answers for folks when it came to how they should be naming, grouping, and organizing their microservices. I just don’t have enough knowledge about their organization, clients, and the domains in which they operate to provide a simple answer. It is another one of those API journeys an organization will have to embark on, and find their own way forward. It would take so much time for me to get to know an organization, its culture, resources, and how they are being put to use, I hesitate to even provide any advice, short of pointing them to what the API Academy team publishes books, and provides talks on. They are the only guidance I know that goes beyond the hyped of definition of microservices, and actually gets at the root of how you do it within specific domains, and tackle the cultural side of the conversation.
A significant portion of my workshops lately have been about helping groups think about delivering services using a consistent API lifecycle, and showing them the potential for API governance if they can achieve this consistency. Clearly I need to back up a bit, and address some of the prep work involved with making sure they have an organizational chart, all of the schema they can possibly bring to the table, existing architecture and services in play, as well as much detail on the clients, industries, and domains in which they operate. Most of my workshops I’m going in blind, not knowing who will all be there, but I think I need a section dedicated to the business side of doing microservices, before I ever begin talking about the technical details of delivering microservices, otherwise I will keep getting questions like this that I can’t answer.
Another area that is picking up momentum for me in these discussions is a symptom of of the lack of API discovery, and directly related to the other areas I just mentioned. You need to be able to deliver APIs along a lifecycle, but more importantly you need to be able to find the services, schema, and people behind them, as well as coherently speak to who will be consuming them. Without a comprehensive discovery, and the ability to understand all of these dependencies, organizations will never be able to find the success they desire with microservices. They won’t be any better than the monolithic way many organizations have been doing things to date, it will just be much more distributed complexity, which will achieve the same results as the monolithic systems that are in place today.
The topic of API discovery has been picking up momentum in 2018. It is something I’ve worked on for years, but with the number of microservices emerging out there, it is something I’m seeing become a concern amongst providers. It is also something I’m seeing more potential vendor chatter, looking to provide more services and tooling to help alleviate API discovery pain. Even with all this movement, there is still a lot of education and discussion to occur on the subject to help bring people up to speed on what is API discovery.
The most common view of what is API discovery, is when you need to find an API for developing an application. You have a need for a resource in your application, and you need to look across your internal and partner resources to find what you are looking for. Beyond that, you will need to search for publicly available API resources, using Google, Github, ProgrammableWeb, and other common ways to find popular APIs. This is definitely the most prominent perspective when it comes to API discovery, but it isn’t the only dimension of this problem. There are several dimensions to this stop along the API lifecycle, that I’d like to flesh out further, so that I can better articulate across conversations I am having.
Another area that gets lumped in with API discovery is the concept of service discovery, or how your APIs will find their backend services that they use to make the magic happen. Service discovery focuses on the initial discovery, connectivity, routing, and circuit breaker patterns involved with making sure an API is able to communicate with any service it depends on. With the growth of microservices there are a number of solutions like Consul that have emerged, and cloud providers like AWS are evolving their own service discovery mechanisms. Providing one dimension to the API discovery conversation, but different from, and often confused with front-end API discovery and how developers and applications find services.
One of the least discussed areas of API discovery, but is one that is picking up momentum, is finding APIs when you are developing APIs, to make sure you aren’t building something that has already been developed. I come across many organizations who have duplicate and overlapping APIs that do similar things due to lack of communication and a central directory of APIs. I’m getting asked by more groups regarding how they can be conducting API discovery by default across organizations, sniffing out APIs from log files, on Github, and other channels in use by existing development teams. Many groups just haven’t been good at documenting and communicating around what has been developed, as well as beginning new projects without seeing what already exists–something that will only become a great problem as the number of microservices grows.
The other dimension of API discovery I’m seeing emerge is discovery in the service of governance. Understand what APIs exist across teams so that definitions, schema, and other elements can be aggregated, measured, secured, and governed. EVERY organization I work with is unaware of all the data sources, web services, and APIs that exist across teams. Few want to admit it, but it is a reality. The reality is that you can’t govern or secure what you don’t know you have. Things get developed so rapidly, and baked into web, mobile, desktop, network, and device applications so regularly, that you just can’t see everything. Before companies, organizations, institutions, and government agencies are going to be able to govern anything, they are going to have begin addressing the API discovery problem that exists across their teams.
API discovery is a discipline that is well over a decade old. It is one I’ve been actively working on for over 5 years. It is something that is only now getting the discussion it needs, because it is a growing concern. It will be come a major concern with each passing day of the microservice evolution. People are jumping on the microservices bandwagon without any coherent way to organize schema, vocabulary, or API definitions. Let alone any strategy for indexing, cataloging, sharing, communicating, and registering services. I’m continuing my work on APIs.json, and the API Stack, as well as pushing forward my usage of OpenAPI, Postman, and AsyncAPI, which all contribute to API discovery. I’m going to continue thinking about how we can publish open source directories, catalogs, and search engines, and even some automated scanning of logs and other ways to conduct discovery in the background. Eventually, we will begin to find more solutions that work–it will just take time.
I wrote about Werner Vogel of Amazon’s post considering the impact of cloud regions a couple weeks back. I feel that his post captured an aspect of doing business in the cloud that isn’t discussed enough, and one that will continue to drive not just the business of APIs, but also increasingly the politics of APIs. Amidst increasing digital nationalism, and growing regulation of not just the pipes, but also platforms, understanding where your APIs are operating, and what networks you are using will become very important to doing business at a global scale.
It is an area I’m adding to my list of machine readable API definitions I’d like to add to the APIs.json stack. The goal with APIs.json is to provide a single index where we can link to all the essential building blocks of a APIs operations, with OpenAPI being the first URI, which provides a machine readable definition of the surface area of the APIs. Shortly after establishing the APIs.json specification, we also created API Commons, which is designed to be a machine readable specification for describing the licensing applied to an API, in response to the Oracle v Google API copyright case. Beyond that, there hasn’t been many other machine readable resources, beyond some existing API driven solutions used as part of API operations like Github and Twitter. There are other API definitions like Postman Collections and API Blueprint that I reference, but they are in the same silo as OpenAPI operates within.
Most of the resources we link to are still human-centered URLs like documentation, pricing, terms of service, support, and other essential building blocks of API operations. However, the goal is to evolve as many of these as possible towards being more machine readable. I’d like to see pricing, terms of services, and aspects of support become machine readable, allowing them to become more automated and understood not just at discovery, but also at runtime. I’m envisioning that regions should be added to this list of currently human readable building blocks that should eventually become machine readable. I feel like we are going to be needing to make runtime decisions regarding API regions, and we will need major cloud providers like Amazon, Azure, and Google to describe their regions in a machine readable way–something that API providers can reflect in their own API definitions, depending on which regions they operate in.
At the application and client level, we are going to need to be able to quantify, articulate, and switch between different regions depending on the user, type of resources being consumed, and business being conducted. While this can continue being manual for a while, at some point we are going to need it to become machine readable so it can become part of the API discovery, as well as integration layers. I do not know what this machine readable schema will look like, but I’m sure it will be defined based upon what AWS, Azure, and Google are already up to. However, it will quickly need to become a standard that is owned by some governing body, and overseen by the community and not just vendors. I just wanted to plant the seed, and it is something I’m hoping will grow over time, but I’m sure it will take 5-10 years before something takes roots, based upon my experience with OpenAPI, APIs.json, and API Commons.
You’ve heard of OpenAPI, right? It is the API specification for defining the surface area of your web API, and the schema you employ–making your public API more discoverable, and consumable in a variety of tools services. OpenAPI is the API definition for documenting your API when you are just getting started with your platform, and you are looking to maximize the availability and access of your platform API(s). After you’ve acquired all the users, content, investment, and other value, ClosedAPI is the format you will want to switch to, abandoning OpenAPI, for something a little more discreet.
Collect As Much Data As You Possibly Can
Early on you wanted to be defining the schema for your platform using OpenAPI, and even offering up a GraphQL layer, allowing your data model to rapidly scale, adding as may data points as you possible can. You really want to just ingest any data you can get your hands on the browser, mobile phones, and any other devices you come into contact with. You can just dump it all into big data lake, and sort it out later. Adding to your platform schema when possible, and continuing to establish new data points that can be used in advertising and targeting of your platform users.
Turn The Firehose On To Drive Activity
Early on you wanted your APIs to be 100% open. You’ve provided a firehose to partners. You’ve made your garden hose free to EVERYONE. OpenAPI was all about providing scalable access to as many users as you can through streaming APIs, as well as lower volume transactional APIs you offer. Don’t rate limit too heavily. Just keep the APIs operating at full capacity, generating data and value for the platform. ClosedAPI is for defining your API as you begin to turn off this firehose, and begin restricting access to your garden hose APIs. You’ve built up the capacity of the platform, you really don’t need your digital sharecroppers anymore. They were necessary early on in your business, but they are not longer needed when it comes to squeezing as much revenue as you can from your platform.
The ClosedAPI Specification
We’ve kept the specification as simple as possible. Allowing you to still say you have API(s), but also helping make sure you do not disclose too much about what you actually have going on. Providing you the following fields to describe your APIs:
That is it. You can still have hundreds of APIs. Issue press releases. Everyone will just have to email you to get access to your APIs. It is up to you to decide who actually gets access to your APIs, which emails you respond, or if the email account is ever even checked in the first place. The objective is just to appear that you have APIs, and will entertain requests to access them.
Maintain Control Over Your Platform
You’ve worked hard to get your platform to where it is. Well, not really, but you’ve worked hard to ensure that others do the work for you. You’ve managed to convince a bunch of developers to work for free building out the applications and features of your platform. You’ve managed to get the users of those applications to populate your platform with a wealth of data, making your platform exponentially more valuable that you could have done on your own. Now that you’ve achieved your vision, and people are increasingly using your APIs to extract value that belongs to you, you need to turn off the fire hose, garden hose, and kill off applications that you do not directly control.
The ClosedAPI specification will allow you to still say that you have APIs, but no longer have to actually be responsible for your APIs being publicly available. Now all you have to do is worry about generating as much revenue as you possibly can from the data you have. You might lose some of your users because you do not have publicly available APIs anymore, as well as losing some of your applications, but that is ok. Most of your users are now trapped, locked-in, and dependent on your platform–continuing to generate data, content, and value for your platform. Stay in tune with the specification using the road map below.
- Remove Description – The description field seems extraneous.
I’ve talked about how generating an OpenAPI (fka Swagger) definition from code is still the dominate way that microservice owners are producing this artifact. This is a by-product of developers seeing it as just another JSON artifact in the pipeline, and from it being primarily used to create API documentation, often times using Swagger UI–which is also why it is still called Swagger, and not OpenAPI. I’m continuing my campaign to help the projects I’m consulting on be more successful with their overall microservices strategy by helping them better understand how they can work in concert by focus in on OpenAPI, and realizing that it is the central contract for their service.
Each Service Beings With An OpenAPI Contract There is no reason that microservices should start with writing code. It is expensive, rigid, and time consuming. The contract that a service provides to clients can be hammered out using OpenAPI, and made available to consumers as a machine readable artifact (JSON or YAML), as well as visualized using documentation like Swagger UI, Redocs, and other open source tooling. This means that teams need to put down their IDE’s, and begin either handwriting their OpenAPI definitions, or being using an open source editor like Swagger Editor, Apicurio, API GUI, or even within the Postman development environment. The entire surface area of a service can be defined using OpenAPI, and then provided using mocked version of the service, with documentation for usage by UI and other application developers. All before code has to be written, making microservices development much more agile, flexible, iterative, and cost effective.
Mocking Of Each Microservice To Hammer Out Contract Each OpenAPI can be used to generate a mock representation of the service using Postman, Stoplight.io, or other OpenAPI-driven mocking solution. There are a number of services, and tooling available that takes an OpenAPI, an generates a mock API, as well as the resulting data. Each service should have the ability to be deployed locally as a mock service by any stakeholder, published and shared with other team members as a mock service, and shared as a demonstration of what the service does, or will do. Mock representations of services will minimize builds, the writing of code, and refactoring to accommodate rapid changes during the API development process. Code shouldn’t be generated or crafted until the surface area of an API has been worked out, and reflects the contract that each service will provide.
OpenAPI Documentation Always AVailable In Repository Each microservice should be self-contained, and always documented. Swagger UI, Redoc, and other API documentation generated from OpenAPI has changed how we deliver API documentation. OpenAPI generated documentation should be available by default within each service’s repository, linked from the README, and readily available for running using static website solutions like Github Pages, or available running locally through the localhost. API documentation isn’t just for the microservices owner / steward to use, it is meant for other stakeholders, and potential consumers. API documentation for a service should be always on, always available, and not something that needs to be generated, built, or deployed. API documentation is a default tool that should be present for EVERY microservice, and treated as a first class citizen as part of its evolution.
Bringing An API To Life Using It’s OpenAPI Contract Once an OpenAPI contract has been been defined, designed, and iterated upon by service owner / steward, as well as a handful of potential consumers and clients, it is ready for development. A finished (enough) OpenAPI can be used to generate server side code using a popular language framework, build out as part of an API gateway solution, or common proxy services and tooling. In some cases the resulting build will be a finished API ready for use, but most of the time it will take some further connecting, refinement, and polishing before it is a production ready API. Regardless, there is no reason for an API to be developed, generated, or built, until the OpenAPI contract is ready, providing the required business value each microservice is being designed to deliver. Writing code, when an API will change is an inefficient use of time, in a virtualized API design lifecycle.
OpenAPI-Driven Monitoring, Testing, and Performance A read-to-go OpenAPI contract can be used to generate API tests, monitors, and deliver performance tests to ensure that services are meeting their business service level agreements. The details of the OpenAPI contract become the assertions of each test, which can be executed against an API on a regular basis to measure not just the overall availability of an API, but whether or not it is actually meeting specific, granular business use cases articulated within the OpenAPI contract. Every detail of the OpenAPI becomes the contract for ensuring each microservice is doing what has been promised, and something that can be articulated and shared with humans via documentation, as well as programmatically by other systems, services, and tooling employed to monitor and test accordingly to a wider strategy.
Empowering Security To Be Directed By The OpenAPI Contract An OpenAPI provides the entire details of the surface area of an API. In addition to being used to generate tests, monitors, and performance checks, it can be used to inform security scanning, fuzzing, and other vital security practices. There are a growing number of services and tooling emerging that allow for building models, policies, and executing security audits based upon OpenAPI contracts. Taking the paths, parameters, definitions, security, and authentication and using them as actionable details for ensuring security across not just an individual service, but potentially hundreds, or thousands of services being developed across many different teams. OpenAPI quickly is becoming not just the technical and business contract, but also the political contract for how you do business on web in a secure way.
OpenAPI Provides API Discovery By Default An OpenAPI describes the entire surface area for the request and response of each API, providing 100% coverage for all interfaces a services will possess. While this OpenAPI definition will be generated mocks, code, documentation, testing, monitoring, security, and serving other stops along the lifecycle, it provides much needed discovery across groups, and by consumers. Anytime a new application is being developed, teams can search across the team Github, Gitlab, Bitbucket, or Team Foundation Server (TFS), and see what services already exist before they begin planning any new services. Service catalogs, directories, search engines, and other discovery mechanisms can use OpenAPIs across services to index, and make them available to other systems, applications, and most importantly to other humans who are looking for services that will help them solve problems.
OpenAPI Deliver The Integration Contract For Client OpenAPI definitions can be imported in Postman, Stoplight, and other API design, development, and client tooling, allowing for quick setup of environment, and collaborating with integration across teams. OpenAPIs are also used to generate SDKs, and deploy them using existing continuous integration (CI) pipelines by companies like APIMATIC. OpenAPIs deliver the client contract we need to just learn about an API, get to work developing a new web or mobile application, or manage updates and version changes as part of our existing CI pipelines. OpenAPIs deliver the integration contract needed for all levels of clients, helping teams go from discovery to integration with as little friction as possible. Without this contract in place, on-boarding with one service is time consuming, and doing it across tens, or hundreds of services becomes impossible.
OpenAPI Delivers Governance At Scale Across Teams Delivering consistent APIs within a single team takes discipline. Delivering consistent APIs across many teams takes governance. OpenAPI provides the building blocks to ensure APIs are defined, designed, mocked, deployed, documented, tested, monitored, perform, secured, discovered, and integrated with consistently. The OpenAPI contract is an artifact that governs every stop along the lifecycle, and then at scale becomes the measure for how well each service is delivering at scale across not just tens, but hundreds, or thousands of services, spread across many groups. Without the OpenAPI contract API governance is non-existent, and at best extremely cumbersome. The OpenAPI contract is not just top down governance telling what they should be doing, it is the bottom up contract for service owners / stewards who are delivering the quality services on the ground inform governance, and leading efforts across many teams.
I can’t articulate the importance of the OpenAPI contract to each microservice, as well as the overall organizational and project microservice strategy. I know that many folks will dismiss the role that OpenAPI plays, but look at the list of members who govern the specification. Consider that Amazon, Google, and Azure ALL have baked OpenAPI into their microservice delivery services and tooling. OpenAPI isn’t a WSDL. An OpenAPI contract is how you will articulate what your microservice will do from inception to deprecation. Make it a priority, don’t treat it as just an output from your legacy way of producing code. Roll up your sleeves, and spend time editing it by hand, and loading it into 3rd party services to see the contract for your microservice in different ways, through different lenses. Eventually you will begin to see it is much more than just another JSON artifact laying around in your repository.
I’m working on a healthcare related microservice project, and I’m looking for a way to help developers express their service dependencies within the OpenAPI or some other artifact. At this point I’m feeling like the OpenAPI is the place to articulate this, adding a vendor extension to the specification that can allow for the referencing of one or more other services any particular service is dependent on. Helping make service discovery more machine readable at discovery and runtime.
To help not reinvent the wheel, I am looking at using the Schema.org Web API type including the extensions put forth by Mike Ralphson and team. I’d like the x-api-dependencies collection to adopt a standardized schema, that was flexible enough to reference different types of other services. I’d like to see the following elements be present for each dependency:
- versions (OPTIONAL array of thing -> Property -> softwareVersion). It is RECOMMENDED that APIs be versioned using [semver]
- entryPoints (OPTIONAL array of Thing -> Intangible -> EntryPoint)
- license (OPTIONAL, CreativeWork or URL) - the license for the design/signature of the API
- transport (enumerated Text: HTTP, HTTPS, SMTP, MQTT, WS, WSS etc)</p>
- apiProtocol (OPTIONAL, enumerated Text: SOAP, GraphQL, gRPC, Hydra, JSON API, XML-RPC, JSON-RPC etc)
- webApiDefinitions (OPTIONAL array of EntryPoints) containing links to machine-readable API definitions
- webApiActions (OPTIONAL array of potential Actions)
Using the Schema.org Web type would allow for a pretty robust way to reference dependencies between services in a machine readable way, that can be indexed, and even visualized in services and tooling. When it comes to evolving and moving forward services, having dependency details baked in by default make a lot of sense, and ideally each dependency definition would have all the details of the dependency, as well as potential contact information, to make sure everyone is connected regarding the service road map. Anytime a service is being deprecated, versioned, or impacted in any way, we have all the dependencies needed to make an educated decision regarding how to progress with the least amount of friction as possible.
I’m going to go ahead and create a draft OpenAPI vendor extension specification for x-service-dependencies, and use the Schema.org WebAPI type, complete with the added extensions. Once I start using it, and have successfully implemented it for a handful of services I will publish and share a working example. I’m also on the hunt for other examples of how teams are doing this. I’m not looking for code dependency management solutions, I am specifically looking for API dependency management solutions, and how teams are making these dependencies discoverable in a machine readable way. If you know of any interesting approaches, please let me know, I’d like to hear more about it.
I just finished a narrative around my API Stack profiling, telling the entire story around the profiling of APIs for inclusion in the stack. To help encourage folks to get involved, I wanted to help distill down the process into a single checklist that could be implemented by anyone.
The Github Base Everything begins as a Github repository, and it can existing in any user or organization. Once ready, I can fork and publish as part of the API stack, or sync with an existing repository project.
- Create Repo - Create a single repository with the name of the API provider in plain language.
- Create README - Add a README for the project, articulating who the target is and the author.
OpenAPI Definition Profiling the API surface area using OpenAPI, providing a definition of the request and response structure for all APIs. Head over to their repository if you need to learn more about OpenAPI. Ideally, there is an existing OpenAPI you can start with, or other machine readable definition you can use as base–look around within their developer portal, because sometimes you can find an existing definition to start with. Next look on Github, as you never know where there might be something existing that will save you time an energy. However you approach, I’m looking for complete details on the following:
- info - Provide as much information about the API.
- host - Provide a host, or variables describing host.
- basePath - Document the basePath for the API.
- schemes - Provide any schemes that the API uses.
- produces - Document which media types the API uses.
- paths - Detail the paths including methods, parameters, enums, responses, and tags.
- definitions - Provide schema definitions used in all requests and responses.
To help accomplish this, I often will scrape, and use any existing artifacts I can possible find. Then you just have to roll up your sleeves and begin copying and pasting from the existing API documentation, until you have a complete definition. There is never any definitive way to make sure you’ve profiled the entire API, but do your best to profile what is available, including all the detail the provider as shared. There will always be more that we can do later, as the API gets used more, and integrated by more providers and consumers.
Postman Collection Once you have an OpenAPI definition available for the API, import it into Postman. Make sure you have a key, and the relevant authentication environment settings you need. Then begin making API calls to each individual API path, making sure your API definition is as complete as it possibly can. This can be the quickest, or the most time consuming of the profiling, depending on the complexity and design of the API. The goal is to certify each API path, and make sure it truly reflects what has been documented. Once you are done, export a Postman Collection for the API, complimenting the existing OpenAPI, but with a more run-time ready API definiton.
Merging the Two Definitions Depending on how many changes occurred within the Postman portion of the profiling you will have to sync things up with the OpenAPI. Sometimes it is a matter of making minor adjustments, sometimes you are better off generating an entirely new OpenAPI from the Postman Collection using APIMATIC’s API Transformer. The goal is to make sure the OpenAPI and Postman are in sync, and working the same way as expected. Once they are in sync they can uploaded to the Github repository for the project.
Managing the Unkown Unknowns There will be a lot of unknowns along the way. A lot of compromises, and shortcuts that can be taken. Not every definition will be perfect, and sometimes it will require making multiple definitions because of the way an API provider has designed their API and used multiple subdomains. Document it all as Github issues in the repository. Use the Github issues for each API as the journal for what happened, and where you document any open questions, or unfinished dwork. Making the repository the central truth for the API definition, as well as the conversation around the profiling process.
Providing Central APIs.json Index The final step of the process is to create an APIs.json index for the API. You can find a sample one over at the specification website. When I profile an API using APIs.json I am always looking for as much detail as I possibly can, but for the purposes of API Stack profiling, I’m looking for these essential building blocks:
- Website - The primary website for an entity owning the API.
- Portal - The URL to the developer portal for an API.
- Documentation - The direct link to the API documentation.
- OpenAPI - The link to the OpenAPI I created on Github.
- Postman - The link to the Postman Collection I created on Github.
- Sign Up - Where do you sign up for an API.
- Pricing - A link to the plans, tiers, and pricing for an API.
- Terms of Service - A URL to the terms of service for an API.
- Twitter - The Twitter account for the API provider – ideally, API specific.
- Github - The Github account or organization for the API provider.
If you create multiple OpenAPIs, and Postman Collections, you can add an entry for each API. If you break a larger API provider into several entity provider repositories, you can link them together using the include property of the APIs.json file. I know the name of the specification is JSON, but feel free to do them in YAML if you feel more comfortable–I do. ;-) The goal of the APIs.json is to provide a complete profile of the API operations, going beyond what OpenAPI and Postman Collections deliver.
Including In The API Stack You should keep all your work in your own Github organization or user account. Once you’ve created a repository you would like to include in the API Stack, and syndicate the work to the Streamdata.io API Gallery, APIs.io, APIs.guru, Postman Network, and others, then just submit as a Github issue on the main repository. I’m still working on the details of how to keep repositories in sync with contributors, then reward and recognize them for work their work, but for now I’m relying on Github to track all contributions, and we’ll figure this part out later. The API Stack is just the workbench for all of this, and I’m using it as a place to merge the work of many partners, from many sources, and then work to sensibly syndicate out validated API profiles to all the partner galleries, directories, and search engines.
I’m thinking a lot about what is needed at API runtime lately. How we document and provide machine readable definitions for APIs, and how we provide authentication, pricing, and even terms of service to help reduce friction. As Mike Amundsen (@mamund) puts it, to enable “find and bind”. This goes beyond simple API on-boarding, and getting started pages, and looks to make APIs executable within a single click, allowing us to put them to use as soon as we find them.
The most real world example of this in action can be found with the Run in Postman button, which won’t always deal with the business and politics of APIs at runtime, but will deal with just about everything else. Allowing API providers to publish Run in Postman Buttons, defined using a Postman Collection, which include authentication environment details, that API consumers can use to quickly fire up an API in a single click. One characteristic I’ve come across that contributes to Postman Collections being truly executable is that they reflect the small unit possible for use at API runtime.
You can see an example of this in action over at Peachtree Data, who like many other API providers have crafted Run in Postman buttons, but instead of doing this for the entire surface area of their API, they have done it for a single API path. Making the Run in Postman button much more precise, and executable. Taking it beyond just documentation, to actually being more of a API runtime executable artifact. This is a simple shift in how Postman Collections can be used, but a pretty significant one. Now instead of wading through all of Peachtree’s APIs in my Postman, I can just do an address cleanse, zip code lookup, or email validation–getting down to business in a single click.
This is an important aspect of on-boarding developers. I may not care about wading through and learning about all your APIs right now. I’m just looking for the API solution I need to a particular problem. Why clutter up my journey with a whole bunch of other resources? Just give me what I need, and get out of my way. Most other API providers I have looked at in Postman’s API Network have provided a single Run in Postman button for all of their APIs, where Peachtree has opted to provide many Run in Postman buttons for each of their APIs. Distinguishing themselves, and the value of each of their API resources in a pretty significant way.
I asked the questions the other week, regarding how big or how small is an API? I’m struggling with this question in my API stack work, as part of an investment by Streamdata.io to develop an API gallery. Do people want to find Amazon Web Services APIs? Amazon EC2 APIs? Or the single path for firing up an instance of EC2? What is the small unit of compute we should be documenting, generating OpenAPI and Postman Collections for? I feel like this is an important API discovery conversation to be having. I think depending on the circumstances, the answer will be different. It is a question I’ll keep asking in different scenarios, to help me better understand how I can document, publish, and make APIs not just more discoverable, but usable at runtime.
The Postman API Network is one of the recent movements in the API discovery space I’ve been working to get around to covering. As Postman continues its expansion from being just an API client, to a full lifecycle API development solution, they’ve added a network for discovering existing APIs that you can begin using within Postman in a single click. Postman Collections make it ridiculously easy to get up and running with an API. So easy, I’m confounded why ALL APIs aren’t publishing Postman Collections with Run in Postman Buttons published in their API docs.
The Postman API Network provides a catalog of APIs in over ten categories, with links to each API’s documentation. All of the APIs in the network have a Run in Postman button available as part of their documentation, which includes them in the Postman API Network. It is a pretty sensible approach to building a network of valuable APIs, who all have invested in there being a runtime-ready, machine readable Postman Collection for their APIs. One of the more interesting approaches I’ve seen introduced to help solve the API discovery problem in the eight years I’ve been doing API Evangelist.
I’ve been talking to Abhinav Asthana (@a85) about the Postman API Network, and working to understand how I can contribute, and help grow the catalog as part of my work as the API Evangelist. I’m a fan of Postman, and an advocate of it as an API lifecycle development solution, but I’m also really keen on bringing comprehensive API discovery solutions to the table. With the Postman API Network, and other API discovery solutions I’m seeing emerge recently, I’m finding renewed energy for this area of my work. Something I’ll be brainstorming and writing about more frequently in coming months.
Streamdata.io has been investing in me moving forward the API discovery conversation, to build out their vision of a Streamdata.io API Gallery, but also to contribute to the overall API discovery conversation. I’m in the middle of understanding how this aligns with my existing API Stack work, APIs.json and APIs.io effort, as well as with APIs.guru, AnyAPI, and the wider OpenAPI Initiative. If you have thoughts you’d like to share, feel free to ping me, and I’m happy to talk more about the API discovery, network, and run-time work I’m contributing to, and better understand how your work fits into the picture.
I’m putting some thought into the Schema.Org WebAPI Type Extension proposal by Mike Ralphson (Mermade Software) and Ivan Goncharov (APIs.guru), to “facilitate better automatic discovery of WebAPIs and associated machine and human-readable documentation”. It’s an interesting evolution in how we define APIs, in terms of API discovery, but I would also add potentially at “execute time”.
Here is what a base WebAPI type schema could look like:
"name": "Google Knowledge Graph Search API",
"description": "The Knowledge Graph Search API lets you find entities in the Google Knowledge Graph. The API uses standard schema.org types and is compliant with the JSON-LD specification.",
"name": "Google Inc."
Then the proposed extensions could include the following:
- versions (OPTIONAL array of thing -> Property -> softwareVersion). It is RECOMMENDED that APIs be versioned using [semver]
- entryPoints (OPTIONAL array of Thing -> Intangible -> EntryPoint)
- license (OPTIONAL, CreativeWork or URL) - the license for the design/signature of the API
- transport (enumerated Text: HTTP, HTTPS, SMTP, MQTT, WS, WSS etc)</p>
- apiProtocol (OPTIONAL, enumerated Text: SOAP, GraphQL, gRPC, Hydra, JSON API, XML-RPC, JSON-RPC etc)
- webApiDefinitions (OPTIONAL array of EntryPoints) containing links to machine-readable API definitions
- webApiActions (OPTIONAL array of potential Actions)
The webApiDefinitions (EntryPoint) contentType property contains a reference to one of the following conten types:
- OpenAPI / Swagger in JSON - application/openapi+json or application/x-openapi+json
- OpenAPI / Swagger in YAML - application/openapi
- RAML - application/raml+yaml
- API Blueprint in markdown - text/vnd.apiblueprint
- API Blueprint parsed in JSON - application/vnd.refract.parse-result+json
- API Blueprint parsed in YAML - application/vnd.refract.parse-result+yaml
Then the webApiActions property brings a handful of actions to the table, with the following being suggested:
- apiAuthentication - Links to a resource detailing authentication requirements. Note this is a human-readable resource, not an authentication endpoint
- apiClientRegistration - Links to a resource where a client may register to use the API
- apiConsole - Links to an interactive console where API calls may be tested
- apiPayment - Links to a resource detailing pricing details of the API
I fully support extending the Schema.org WebAPI vocabulary in this way. It adds all the bindings needed to make the WebAPI type executable at runtime, as well as it states at discovery time. I like the transport and protocol additions, helping ensure the WebAPI vocabulary is as robust as it possibly can. webApiDefinitions provides all the technical details regarding the surface area of the API we need to actually engage with it at runtime, and webApiActions begins to get at some of the business of APIs friction that exists at runtime. Making for an interesting vocabulary that can be used to describe web APIs, which also becomes more actionable by providing everything you need to get up and running.
The suggestions are well thought out and complete. If I was to add any elements, I’d say it also needs a support link. There will be contact information embedded within the API definitions, but having a direct link along with registration, documentation, terms of service, authentication, and payment would help out significantly. I would say that the content type to transport and protocol coverage is deficient a bit. Meaning you have SOAP, but not referencing WSDL. I know that there isn’t a direct definition covering every transport and protocol, but eventually it should be as comprehensive as it can. (ie. adding AsyncAPI, etc. in the future)
The WebAPI type extensions reflect what we have been trying to push forward with our APIs.json work, but comes at it from a different direction. I feel there are significant benefits to having all these details as part of the Schema.org vocabulary, expanding on what you can describe in a common way. Which can then also be used as part of each APIs requests, responses, and messages. I don’t see APIs.json as part of a formal vocabulary like this–I see it more as the agile format for indexing APIs that exist, and building versatile collections of APIs which could also contain a WebAPI reference.
I wish I had more constructive criticism or feedback, but I think it is a great first draft of suggestions for evolving the WebAPI type. There are other webApiActions properties I’d like to see based upon my APIs.json work, but I think this represents an excellent first step. There will be some fuzziness between documentation and apiConsole, as well as gaps in actionability between apiAuthentication, and apiClientRegistration–thinks like application creation (to get keys), and opportunities to have Github, Twitter, and other OpenID/OAuth authentication, but these things can be worked out down the road. Sadly there isn’t much standardization at this layer currently, and I see this extension as a first start towards making this happen. As I said, this is a good start, and we have lots of work ahead as we see more adoption.
Nice work Mike and Ivan! Let me know how I can continue to amplify and get the word out. We need to help make sure folks are describing their APIs using Schema.org. I’d love to be able to automate the discovery of APIs, using existing search engines and tooling–I know that you two would like to see this as well. API discovery is a huge problem, which there hasn’t been much movement on in the last decade, and having a common vocabulary that API providers can use to describe their APIs, which search engines can tune into would help move us further down the road when it comes to having more robust API discovery.
I’ve been breaking down the work on banking APIs coming out of Open Banking in the UK lately. I recently took all their OpenAPI definitions and published as a demo API developer portal. Bringing the definitions out of the shadows a little bit, and showing was is possible with the specification. Pushing the project forward some more today I published the Open Banking API Directory specification to the project, showing the surface area of the very interesting, and important component of open banking APIs in the UK.
The Open Banking Directory provides a pretty complete, albeit rough and technical approach to delivering observability for the UK banking industry API ecosystem actor layer. Everyone involved in the banking API ecosystem in UK has to be registered in the directory. It provides profiles of the banks, as well as any third party players. It really provides an unprecedented, industry level look at how you can make API ecosystems more transparent and observable. This thing doesn’t exist at the startup level because nobody wants to be open with the number of developers, or much else regarding the operation of their APIs. Making any single, or industry level API ecosystem, operate as black boxes–even if they claim to be an “open API”.
Could you imagine if API providers didn’t handle their own API management layer, and an industry level organization would handle the registration, certification, directory, and dispute resolution between API providers and API consumers? Could you imagine if we could see the entire directory of Facebook and Twitter developers, understand what businesses and individuals were behind the bots and other applications? Imagine if API providers couldn’t lie about the number of active developers, and we knew how many different APIs each application developers used? And it was all public data? An entirely different API landscape would exist, with entirely different incentive models around providing and consuming gAPIs.
The Open Banking Directory is an interesting precedent. It’s not just an observable API authentication and management layer. It also is an API. Making the whole thing something that can be baked into the industry level, as well as each individual application. I’m going to have to simmer on this concept some more. I’ve thought a lot about collective API developer and client solutions, but never anything like this. I’m curious to see how this plays out in a heavily regulated country and industry, but also eager to think about how something like this might work (or not) in government API circles, or even in the private sector, within smaller, less regulated industries.
I had breakfast with Mike Amundsen (@mamund) and Matt McLarty (@MattMcLartyBC) of the CA API Academy team this morning in midtown this morning. As we were sharing stories of what each other was working on, the topic of what is needed to execute an API call came up. Not the time consuming find an API, sign up for an account, figure out the terms of service and pricing version, but all of this condensed into something that can happen in a split second within applications and systems.
How do we distill down the essential ingredients of API consumption into a single, machine readable unit that can be automated into what Mike Amundsen calls, “find and bind”. This is something I’ve been thinking a lot about lately as I work on my API discovery research, and there are a handful of elements that need to be present:
- Authentication - Having keys to be able to authentication.
- Surface Area - What is the host, base url, path, headers, and parameters for a request.
- Terms of Service - What are the legal terms of service for consumption.
- Pricing - How much does each API request cost me?
We need these elements to be machine readable and easily accessible at discover and runtime. Currently the surface area of the API can be described using OpenAPI, that isn’t a problem. The authentication details can be included in this, but it means you already have to have an application setup, with keys. It doesn’t include new users into the equation, meaning, discovering, registering, and obtaining keys. I have a draft specification I call “API plans” for the pricing portion of it, but it is something that still needs a lot of work. So, in short, we are nowhere near having this layer ready for automation–which we will need to scale all of this API stuff.
This is all stuff I’ve been beating a drum about for years, and I anticipate it is a drum I’ll be beating for a number of more years before we see come into focus. I’m eager to see Mike’s prototype on “find and bind”, because it is the only automated, runtime, discovery, registration, and execute research I’ve come across that isn’t some proprietary magic. I’m going to be investing more cycles into my API plans research, as well as the terms of service stuff I started way back when alongside my API Commons project. Hopefully, moving all this forward another inch or two.
As I prepare to launch the Streamdata.io API Gallery, I am doing a handful of presentations to partners. As part of this process I am looking to distill down the objectives behind the gallery, and the opportunity it delivers to just a handful of talking points I can include in a single slide deck. Of course, as the API Evangelist, the way I do this is by crafting a story here on the blog. To help me frame the conversation, and get to the core of what I needed to present, I wanted to just ask a couple questions, so that I can answer them in my presentation.
What is the Streamdata.io API Gallery? It is a machine readable, continuously deployed collection of OpenAPI definitions, indexed used APIs.json, with a user friendly user interface which allows for the browsing, searching, and filtering of individual APIs that deliver value within specific industries and topical areas.
What are we looking to accomplish with the Streamdata.io API Gallery? Discover and map out interesting and valuable API resources, then quantify what value they bring to the table while also ranking, categorizing, and making them available in a search engine friendly way that allows potential Streamdata.io customers to discover and understand what is possible.
What is the opportunity around the Streamdata.io API Gallery? Identify the best of breed APIs out there, and break down the topics that they deliver within, while also quantifying the publish and subscribe opportunities available–mapping out the event-driven opportunity that has already begun to emerge, while demonstrating Streamdata.io’s role in helping get existing API providers from where they are today, to where they need to be tomorrow.
Why is this relevant to Streamdata.io, and their road map? It provides a wealth of research that Streamdata.io can use to understand the API landscape, and feed it’s own sales and marketing strategy, but doing it in a way that generates valuable search engine and social media exhaust which potential customers might possibly find interesting, bringing them new API consumers, while also opening their eyes up to the event-driven opportunity that exists out there.
Distilling Things Down A Bit More Ok, that answers the general questions about what the Streamdata.io API Gallery is, and why we are building it. Now I want to distill down a little bit more to help me articulate the gallery as part of a series of presentations, existing as just a handful of bullet points. Helping get the point across in hopefully 60 seconds or less.
- What is the Streamdata.io API Gallery?
- API directory, for finding individual units of compute within specific topics.
- OpenAPI (fka Swagger) driven, making each unit of value usable at run-time.
- APIs.json indexed, making the collections of resources easy to search and use.
- Github hosted, making it forkable and continuously deployable and integrate(able).
- Why is the Streamdata.io Gallery relevant?
- It maps out the API universe with an emphasis on the value each individual API path possesses.
- Categories, tags, and indexes APIs into collections which are published to Github.
- Provides a human and machine friendly view of the existing publish and subscribe landscape.
- Begins to organize the API universe in context of a real time event-driven messaging world.
- What is the opportunity around the Streamdata.io API Gallery?
- Redefining the API landscape from an event-driven perspective.
- Quantify, qualify, and rank APIs to understand what is the most interesting and highest quality.
- Help API providers realize events occurring via their existing platforms.
- Begin moving beyond a request and response model to an event-driven reality.
There is definitely a lot more going on within the Streamdata.io API Gallery, but I think this captures the essence of what we are trying to achieve. A lot of what we’ve done is building upon my existing API Stack work, where I have worked to profile and index public APIs using OpenAPI and APIs.json, but this round of work is taking things to a new level. With API Stack I ended up with lists of companies and organizations, each possessing a list of APIs. The Streamdata.io API Gallery is a list of API resources, broken down by the unit of value they bring to the table, which is further defined by whether it is a GET, POST, or PUT–essentially a publish or subscribe opportunity.
Additionally, I am finally finding traction with the API rating system(s) I have been developing for the last five years. Profiling and measuring the companies behind the APIs I’m profiling, and making this knowledge available not just at discover time, but potentially at event and run time. Basically being able to understand the value of an event when it happens in real time, and be able to make programmatic decisions regarding whether we care about the particular event or not. Eventually, allowing us to subscribe only to the events that truly matter to us, and are of the highest value–then tuning out the rest. Delivering API ratings in an increasingly crowded and noisy event-driven API landscape.
We have the prototype for the Streamdata.io API Gallery ready to go. We are still adding APIs, and refining how they are tagged and organized. The rating system is very basic right now, but we will be lighting up different dimensions of the rating(s) algorithm, and hopefully delivering on different angles of how we quantify the value of events that occurring. I’m guessing we will be doing a soft launch in the next couple of weeks to little fanfare, and it will be something that builds, and evolves over time as the API index gets refined and used more heavily.
I am creating a lot of OpenAPI definitions right now. Streamdata.io is investing in me pushing forward my API Stack work, where I profile API using OpenAPI, and index their operations using APIs.json. From the resulting indexes, we are building out the Streamdata.io API Gallery, which shows the possibilities of providing streaming APIs on top of existing web APIs available across the landscape. The OpenAPI definitions I’m creating aren’t 100% complete, but they are “good enough” for what we are needing to do with them, and are allowing me to catalog a variety of interesting APIs, and automate the proxying of them using Streamdata.io.
I’m finding the most important part of doing this work is making sure there is a rich summary, description, and set of tags for each API. While the actual path, parameters, and security definitions are crucial to programmatically executing the API, the summary, description, and tags are essential so that I can understand what the API does, and make it discoverable. As I list out different areas of my API Stack research, like the financial market data APIs, it is critical that I have a title, and description for each provider, but the summary, description, and tags are what provides the heart of the index for what is possible with each API.
When designing an API, as a developer, I tend to just fly through writing summary, descriptions, and tags for my APIs. I’m focused on the technical details, not this “fluff”. However, this represents one of the biggest disconnects in the API lifecycle, where the developer is so absorbed with the technical details, we forget, neglect, or just don’t are to articulate what we are doing to other humans. The summary, description, and tags are the outlines in the API contract we are providing. These details are much more than just the fluff for the API documentation. They actually describe the value being delivered, and allow this value to be communicated, and discovered throughout the life of an API–they are extremely important.
As I’m doing this work, I realize just how important these descriptions and tags are to the future of these APIs. Whenever it makes sense I’m translating these APIs into streaming APIs, and I’m taking the tags I’ve created and using them to define the events, topics, and messages that are being transacted via the API I’m profiling. I’m quantifying how real time these APIs are, and mapping out the meaningful events that are occurring. This represents the event-driven shift we are seeing emerge across the API landscape in 2018. However, I’m doing this on top of API providers who may not be aware of this shift in how the business of APIs is getting done, and are just working hard on their current request / response API strategy. These summaries, descriptions, and tags, represent how we are going to begin mapping out the future that is happening around them, and begin to craft a road map that they can use to understand how they can keep evolving, and remain competitive.
If you think there is a link I should have listed here feel free to tweet it at me, or submit as a Github issue. Even though I do this full time, I'm still a one person show, and I miss quite a bit, and depend on my network to help me know what is going on.