LWP::Authen::OAuth2::Overview(3)
User Contributed Perl Documentation
LWP::Authen::OAuth2::Overview - Overview of accessing OAuth2 APIs with
LWP::Authen::OAuth2
This attempts to be the document that I wished existed when I first tried to
access an API that used OAuth 2 for authentication. It explains what OAuth 2
is, how it works, what you need to know to use it, and how LWP::Authen::OAuth2
tries to make that easier. It hopefully also explains this in a way which will
help you read documentation written by other people who assume that you have
this knowledge.
Feel free to read as much or little of this document as makes
sense for you. It is not actually designed to be read in a single
sitting.
Since part of the purpose of this document is to familiarize you
with the jargon that you're likely to encounter, all terms commonly used in
discussions of OAuth 2 with a specific meaning are highlighted. Terms
will hopefully be clear from context, but all highlighted terms are
explained in the "Terminology" section.
OAuth 2 makes it easy for large service providers to write many APIs that
users can securely authorize third party consumers to use on
their behalf. Everything good (and bad!) about the specification comes from
this fact.
It therefore specifies an authorization handshake through which
permissions are set up, and then a message signing procedure through which
you can then access the API. Well, actually it specifies many variations of
the authorization handshake, and multiple possible signing procedures,
because large organizations run into a lot of use cases and try to cover
them all. But conceptually they are all fundamentally similar, and so have
been lumped together in one monster spec.
LWP::Authen::OAuth2 exists to help Perl programmers who want to be a
consumer of an API protected by OAuth 2 construct and make all of the
necessary requests to the service provider.
You will still need to set up your relationship with the service
provider, build your user interaction, manage private data (hooks are
provided to make that straightforward), and figure out how to use the API.
If that does not sound like it will make your life easier, then
this module is not intended for you.
If you are not a consumer, this module is definitely
not intended for you. (Though this document may still be helpful.)
OAuth 2 allows a user to tell a service provider that a
consumer should be allowed to access the user's data through an
API. This permissioning happens through the following handshake.
The consumer sends the user to an
authorization_url managed by the service provider. The
service provider tells the user that the consumer wants
access to that account and asks if this is OK. The user confirms that
it is, and is sent back to the consumer with proof of the
conversation. The consumer presents that proof to the service
provider along with proof that it actually is the consumer, and
is granted tokens that will act like keys to the user's account.
After that the consumer can use said tokens to access the API which
is protected by OAuth 2.
All variations of OAuth 2 follow this basic pattern. A large
number of the details can and do vary widely. For example JavaScript
applications that want to make AJAX calls use a different kind of proof.
Applications installed on devices without web browsers will pass information
to/from the user in different ways. And each service provider is free
to do many, many things differently. The specification tries to document
commonalities in what different companies are doing, but does not mandate
that they all do the same thing.
(This sort of complexity is inevitable from a specification that
tries to make the lives of large service providers easy, and the
lives of consumers possible.)
If you want to access an OAuth 2 protected API, you need to become a
consumer. Here are the necessary steps, in the order that things happen
in.
- Register with the service provider
- You cannot access a service provider without them knowing who you
are. After you go through their process, at a minimum you will get a
public client_id, a private client_secret, and have agreed
on one or more redirect_uris that the user can use to
deliver an authorization code back to you. (That is not the only
kind of proof that the user can be given for the consumer,
but it is probably the only one that makes sense for a Perl
consumer.)
The redirect_uri is often a
"https://..." URL under your control.
You also are likely to have had to tell the service provider
about what type of software you're writing (webserver, command line,
etc). This determines your client type. They may call this a
scenario, or flow, or something else.
You will also need information about the service
provider. Specifically you will need to know their Authorization
Endpoint and Token Endpoint. They hopefully also have useful
documentation about things like their APIs.
LWP::Authen::OAuth2 is not directly involved in this step.
If a LWP::Authen::OAuth2::ServiceProvider::Foo class exists,
it should already have the service provider specific information,
and probably has summarized documentation that may make this smoother.
If you're really lucky, there will be a CPAN module (or modules) for the
API (or APIs) that you want to use. If those do not exist, please
consider creating them.
If no such classes exist, you can still use the module. Just
pass the necessary service provider facts in your call to
"LWP::Authen::OAuth2->new(...)" and
an appropriate LWP::Authen::OAuth2::ServiceProvider will be created for
you on the fly.
- Decide how to store sensitive information
- All of the data shared between you and the service provider has to
be stored on your end. This includes tokens that will let you access
private information for the user. You need to be able to securely
store and access these.
LWP::Authen::OAuth2 does not address this, beyond providing
hooks that you are free to use as you see fit.
- Build interaction asking for user permission
- You need to have some way of convincing the user that they want to
give you permission, ending in giving them an authorization_url
which sends them off to the service provider to authorize access.
This interaction can range from a trivial conversation with yourself if
you are the only user you will be handling, to a carefully thought
through sales pitch if you are trying to get members of the public to sign
up.
LWP::Authen::OAuth2 helps you build that URL. The rest is up
to you.
- Build interaction receiving your authorization code
- When the user finishes their interaction with the service
provider, if the service provider is sure that they know where
to send the user (they know your client_id, your
redirect_uri makes sense to them) then the user will be sent to the
redirect_uri to pass information back to you.
If you succeeded, you will receive a code in some way.
For instance if your redirect_uri is a URL, it will have a get
parameter named "code".
You could get an "error"
parameter back instead. See RFC 6749
<http://tools.ietf.org/html/rfc6749#section-4.1.2.1> for a list of
the possible errors. Note that there are possible optional fields with
extra detail. I would not advise optimism about their presence.
LWP::Authen::OAuth2 is not involved with this.
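As a rough illustration of this step, the sketch below pulls the "code" or "error" parameter out of the query string of a request to your redirect_uri, using only core Perl. The helper names parse_query and handle_redirect and the "missing_code" label are hypothetical and are not part of LWP::Authen::OAuth2. A real web application would normally use its framework's own parameter parsing instead.

```perl
use strict;
use warnings;

# Hypothetical helper: decode an application/x-www-form-urlencoded
# query string into a hash (last value wins for repeated keys).
sub parse_query {
    my ($query) = @_;
    my %param;
    for my $pair (split /[&;]/, $query // '') {
        next unless length $pair;
        my ($key, $value) = split /=/, $pair, 2;
        $value //= '';
        for ($key, $value) {
            tr/+/ /;                              # "+" means space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;  # undo percent-encoding
        }
        $param{$key} = $value;
    }
    return \%param;
}

# Hypothetical helper: classify what the service provider sent back.
# "missing_code" is a label made up for this sketch, not an RFC 6749 error.
sub handle_redirect {
    my ($query) = @_;
    my $param = parse_query($query);
    return { ok => 0, error => $param->{error} } if defined $param->{error};
    return { ok => 1, code  => $param->{code}  } if defined $param->{code};
    return { ok => 0, error => 'missing_code' };
}

my $result = handle_redirect('code=4%2FP7q7W91a-oMsCeLvIaQm6bTrgtp7&state=%2Fprofile');
print $result->{ok} ? "code: $result->{code}\n" : "error: $result->{error}\n";
```

Checking for "error" before "code" means a failed authorization is never mistaken for a successful one.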
- Request tokens
- Once you have that code you are supposed to immediately trade it in
for tokens. LWP::Authen::OAuth2 provides the
"request_tokens" method to do this for
you. Should you not actually get tokens, then the
"request_tokens" method will trigger an
error.
NOTE that the code cannot be expected to work
more than once. Nor can you expect the service provider to
repeatedly hand out working codes for the same permission. (The
qualifier "working" matters.) Being told this will hopefully
let you avoid a painful debugging session that I did not enjoy.
- Save and pass around tokens (maybe)
- If you will need access to information in multiple locations (for instance
on several different web pages), then you are responsible for saving and
retrieving those tokens for future use. LWP::Authen::OAuth2 makes it easy
to serialize/deserialize tokens, and has hooks for when they change, but
leaves this step up to you.
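One possible way to handle the saving side, sketched with core Perl only: treat whatever serialized token data you have as an opaque string and keep it in a file that only the current user can read. The helper names save_token_string and load_token_string are hypothetical, and a file is only one option; a database or a session store would follow the same shape.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_TRUNC);
use File::Temp qw(tempdir);

# Hypothetical helper: write a serialized token string to a file.
# Mode 0600 applies only when the file is created; an existing file
# keeps whatever permissions it already had.
sub save_token_string {
    my ($path, $token_string) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_TRUNC, 0600)
        or die "Cannot open $path for writing: $!";
    print {$fh} $token_string or die "Cannot write $path: $!";
    close($fh) or die "Cannot close $path: $!";
    return;
}

# Hypothetical helper: read the string back, or return undef if
# nothing has been saved yet.
sub load_token_string {
    my ($path) = @_;
    return undef unless -e $path;
    open(my $fh, '<', $path) or die "Cannot open $path for reading: $!";
    local $/;                       # slurp the whole file
    my $token_string = <$fh>;
    close($fh);
    return $token_string;
}

# Demonstration in a throwaway directory.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/tokens";
save_token_string($path, '{"access_token":"example"}');
print load_token_string($path), "\n";
```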
- Access the API
- LWP::Authen::OAuth2 takes care of signing your API requests. Which requests
you actually need to make is between you and the service provider.
With luck there will be documentation to help you figure it out, and if
you are really lucky that will be reasonably accurate.
- Refresh access tokens (maybe)
- The access token that is used to sign requests will only work for a
limited time. If you were given a refresh token, that can be used
to request another access token at any time. This raises the
possibility that you make a request, it fails because the access
token expired, you refresh it, and then need to retry your request.
LWP::Authen::OAuth2 will perform this refresh/retry logic for
you automatically if possible, and provides a hook for you to know to
save the updated token data.
Some client types are not expected to use this pattern.
You are only given an access token and are expected to send the
user through the handshake again when that expires. The second time
through the redirect on the service provider's side is immediate,
so the user experience should be seamless. However LWP::Authen::OAuth2
does not try to automate that logic. But
"$oauth2->should_refresh" can let
you know when it is time to send the user through, and
"$oauth2->can_refresh_tokens" will
let you know whether automatic refreshing is available.
Note that even if it is available, retry success is not
guaranteed. The user may revoke your access, the service
provider may decide you are a suspicious character, there may have
been a service outage, etc. LWP::Authen::OAuth2 will throw errors on
these error conditions; handling them is up to you.
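The refresh-and-retry idea from the last step can be sketched in a few lines. This is not how LWP::Authen::OAuth2 implements it; request_with_refresh and both callbacks are hypothetical stand-ins, and the sketch assumes the service provider reports an expired token with HTTP status 401.

```perl
use strict;
use warnings;

# Hypothetical sketch of "refresh, then retry once". Both callbacks are
# stand-ins supplied by the caller:
#   $do_request->($access_token) returns { status => ..., body => ... }
#   $refresh->() returns a new access token, or dies if it cannot.
sub request_with_refresh {
    my ($access_token, $do_request, $refresh) = @_;

    my $response = $do_request->($access_token);
    return $response unless $response->{status} == 401;

    # Assumption: the service provider reports an expired token as 401.
    my $new_token = $refresh->();
    return $do_request->($new_token);   # retry exactly once
}

# Simulated service: only the token "fresh" is accepted.
my $fake_api = sub {
    my ($token) = @_;
    return $token eq 'fresh'
        ? { status => 200, body => 'ok' }
        : { status => 401, body => 'expired' };
};

my $response = request_with_refresh('stale', $fake_api, sub { 'fresh' });
print "$response->{status} $response->{body}\n";
```

Retrying only once keeps a revoked or otherwise broken authorization from turning into an endless loop. If the refresh callback dies, the error reaches the caller, which matches the advice that handling such failures is up to you.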
This section is intended to be used in one of two ways.
The first option is that you can start reading someone else's
documentation and then refer back to here every time you run across a term
that you do not immediately understand.
The second option is that you can read this section straight
through for a reasonably detailed explanation of the OAuth 2 protocol, with
all terms explained. In fact if you choose this option, you will find it
explained in more detail than you need to be a successful
consumer.
However if you use it in the second way, please be advised that
this does not try to be a complete and exact explanation of the
specification. In particular the specification requires specific error
handling from the service provider that I have glossed over, and allows for
extra types of requests that I also glossed over. (Particularly the bit
about how any service provider at any time can add any new method
that they want so long as they invent a new grant_type for it.)
- consumer
- The consumer is the one who needs to be authorized by OAuth 2 to be
able to "consume" an API. If you're reading this document,
that's likely to be you.
- client
- The software on the consumer's side which actually will access the
API. From a consumer's point of view, a consumer and the
client are usually the same thing. But, in fact, a single
consumer may actually write multiple clients. And if one is
a web application while another is a command line program, the differences
can matter to how OAuth 2 will work.
Where I have a choice in this document I say consumer
rather than client because that term is less likely to be overloaded in
most organizations.
- user
- The user is the entity (person or company) who wishes to let the
consumer access their account.
- Resource Owner
- What the OAuth 2 specification calls the user, to focus attention
on the fact that they own the data which will get accessed.
I chose to say user instead of Resource Owner
because that is my best guess as to what the consumer is most
likely to already call them.
- service provider
- The service provider is the one which hosts the account, restricts
access and offers the API. For example, Google.
- Resource Server
- In the OAuth 2 specification, this is the service run by the service
provider which provides an API to the user's data. The name has
deliberate symmetry with Resource Owner.
- Authorization Server
- In the OAuth 2 specification, this is the service run by the service
provider which is responsible for granting access to the Resource
Server.
The consumer does not need to care about this
distinction, but it exposes an important fact about how the service
provider is likely to be structured internally. You typically will
have one team that is responsible for granting access, tracking down
clients that seem abusive, and so on. And then many teams are
free to create useful stuff and write APIs around them, with
authorization offloaded to the first team.
As a consumer, you will make API requests to the
Resource Server signed with proof of authorization from the
Authorization Server, the Resource Server will confirm
authorization with the Authorization Server, and then the
Resource Server will do whatever it was asked to do.
Organizing internal responsibilities in this manner makes it
easier for many independent teams in a large company to write public
APIs.
- client type
- The service provider internally tags each client with a
client type which tells it something about what environment it is
in, and how it interacts with the user. Here are the basic types
listed in RFC 6749
<http://tools.ietf.org/html/rfc6749#section-2.1>:
- web application
- Runs on a web server. Is expected to keep secrets. Likely to be
appropriate for a Perl client.
- user-agent-based application
- JavaScript application running in a browser that wants to make AJAX calls.
Can't keep secrets. Does not make sense for a Perl client.
- native application
- Application installed on a user's machine. Can't keep secrets.
Possibly appropriate for a Perl client.
Of course all of this is up to the service provider. For
example at the time of this writing, Google documents no fewer than six
client types at
<https://developers.google.com/accounts/docs/OAuth2>, none of which
have been given the above names. (They also call them "Scenarios"
rather than client type.) They rename the top two, split native
application into two based on whether your application controls a
browser, and add two new ones.
- flow
- Your flow is the sequence and methods of interactions that set up
authorization. The flow depends on your service provider and
client type. For example the service provider might redirect
the user to a URL controlled by a web application, while instead
for a native application the user is told to cut and paste a code
somewhere.
Despite flow being more common terminology in OAuth 2,
client type is more self-explanatory, so I've generally gone with
that instead.
- client_id
- The client_id is a public ID that tells the service provider
about the client that is accessing it. That is, it says both who
the consumer is, and what the client type is. Being public,
the client_id can be shared with the user. The details of
how this is assigned are between the consumer and the service
provider.
- client_secret
- The client_secret is a somewhat private piece of information that
the consumer can pass to the service provider to prove that
the request really comes from the consumer. How much this is
trusted, and how it is used, will depend on the client type and
service provider.
- redirect_uri
- The service provider needs a way to tell the user how to
pass information back to the consumer in a secure way. That is
provided by the redirect_uri which can be anything from a
"https://..." URL that the
consumer controls to an instruction that lets the service
provider know that it should tell the user to cut and paste
some information.
It is up to the service provider what values are
acceptable for the redirect_uri, and whether it is a piece of
information that is remembered or passed in during the authorization
process.
- state
- The state is an optional piece of information that can be created
by the consumer then added to all requests as an extra piece of
protection against forgery. (You are supposed to create a random piece of
information for each request, then check that you get it back.) In the
OAuth 2 specification it is optional, but recommended. Depending on the
combination of your service provider and client type, it may
be required.
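A minimal sketch of creating and checking a state value with core Perl follows. The helper names new_state and state_matches are hypothetical. The sketch assumes /dev/urandom is available; the fallback shown is weak and is there only so the example runs everywhere.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: make a hard-to-guess state value.
# Assumption: /dev/urandom exists; otherwise this falls back to weaker
# entropy (time, process id, rand), which is NOT suitable for production.
sub new_state {
    my $seed = '';
    if (open(my $fh, '<:raw', '/dev/urandom')) {
        read($fh, $seed, 32);
        close($fh);
    }
    $seed .= join('|', time, $$, rand()) unless length($seed) == 32;
    return sha256_hex($seed);
}

# Hypothetical helper: compare the state we stored for this user with
# the one that came back on the redirect.
sub state_matches {
    my ($expected, $returned) = @_;
    return 0 unless defined $expected && defined $returned;
    return $expected eq $returned ? 1 : 0;
}

my $state = new_state();           # store this in the user's session
print state_matches($state, $state) ? "state ok\n" : "possible forgery\n";
```

The value has to be remembered on the consumer's side (for example in the user's session) between building the authorization_url and receiving the redirect, otherwise there is nothing to compare against.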
- scope
- The scope describes what permissions are to be granted. To get
multiple permissions, you need to join the permissions requested with
spaces. Everything else is up to the service provider.
Inside of the service provider, what likely happens is
that the team which runs a given Resource Server tells the team
running the Authorization Server what permissions to their API
should be called. And then the Authorization Server can limit a
given consumer to just the APIs that the user authorized
them for.
- Authorization Endpoint
- The Authorization Endpoint is the URL provided by the service
provider for the purpose of sending requests to authorize the
consumer to access the user's account. This is part of the
Authorization Server.
- response_type
- The response_type tells the service provider what kind of
information it is supposed to pass back. I am not aware of a case where a
Perl client could usefully use any value other than
"code". However there are flows
where other things happen. For example the flow for the
user-agent-based application client type uses a
response_type of token.
While the field is not very useful for Perl clients, it
is required in the specification. So you have to pass it.
- authorization_url
- This is the URL on the service provider's website that the
user goes to in order to let the service provider know what
authorization is being requested.
It is constructed as the Authorization Endpoint with
get parameters added for the response_type, client_id, and
optionally state. The specification mentions both
redirect_uri and scope but does not actually mandate that
they be accepted or required. However they may be. And, of course, a
given service provider can add more parameters at will, and require (or
not) different things by client type.
An example URL for Google complete with optional extensions is
<https://accounts.google.com/o/oauth2/auth?scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.profile&state=%2Fprofile&redirect_uri=https%3A%2F%2Foauth2-login-demo.appspot.com%2Fcode&response_type=code&client_id=812741506391.apps.googleusercontent.com&approval_prompt=force>
In LWP::Authen::OAuth2 the
"authorization_url" method constructs
this URL. If your request needs to include the state,
scope, or any service provider specific parameter, you
need to pass those as parameters. The others are usefully defaulted from
the service provider and object.
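To make the construction concrete, here is a hand-rolled sketch that assembles such a URL from the Authorization Endpoint and a list of parameters, reusing the values from the Google example above. It is not the code behind the "authorization_url" method; uri_escape_value and build_authorization_url are hypothetical helpers. Spaces come out as %20 rather than the + seen in the example URL.

```perl
use strict;
use warnings;

# Percent-encode everything except RFC 3986 unreserved characters.
sub uri_escape_value {
    my ($text) = @_;
    utf8::encode($text) if utf8::is_utf8($text);
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord($1))/ge;
    return $text;
}

# Hypothetical helper: append query parameters, given as an ordered
# list of key/value pairs, to an Authorization Endpoint.
sub build_authorization_url {
    my ($endpoint, @pairs) = @_;
    my @terms;
    while (my ($key, $value) = splice(@pairs, 0, 2)) {
        push @terms, uri_escape_value($key) . '=' . uri_escape_value($value);
    }
    return $endpoint . '?' . join('&', @terms);
}

my $url = build_authorization_url(
    'https://accounts.google.com/o/oauth2/auth',
    response_type => 'code',
    client_id     => '812741506391.apps.googleusercontent.com',
    redirect_uri  => 'https://oauth2-login-demo.appspot.com/code',
    # Multiple permissions are joined with spaces to form the scope.
    scope         => join(' ', 'https://www.googleapis.com/auth/userinfo.email',
                               'https://www.googleapis.com/auth/userinfo.profile'),
    state         => '/profile',
);
print "$url\n";
```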
- (authorization) code
- If the response_type is set to
"code" (which should be the case), then
on success the service provider will generate a one use
authorization code to give to the user to take back to the
consumer. Depending on the flow this could happen with no
effort on the part of the user. For example the user can be
redirected to the redirect_uri with the code passed as a get
parameter. The web server would then pick these up, finish the handshake,
and then redirect the user elsewhere.
In all interactions where it is passed it is simply called the
code. But it is described in one interaction as an
authorization_code.
- Token Endpoint
- The Token Endpoint is the URL provided by the service
provider for the purpose of sending requests from the consumer
to get tokens allowing access to the user's account.
- grant_type
- The grant_type is the type of grant you expect to get based on
the response_type requested in the authorization_url. For a
response_type of "code" (which is
almost certainly what will be used with any consumer written in
Perl), the grant_type has to be
"authorization_code". If they were being
consistent, then that would be code like it is everywhere else, but
that's what the spec says.
We will later encounter the grant_type
"refresh_token". The specification also
allows other requests, each with its own grant_type, that can be part of
a flow and might prove useful. However you are only likely to encounter
those if you are subclassing LWP::Authen::OAuth2::ServiceProvider. In that
case you will hopefully discover the applicability and details of those
grant_types from the service provider's documentation.
- Access Token Request
- Once the consumer has a code the consumer can submit an
Access Token Request by sending a POST request to the Token
Endpoint with the grant_type, code, client_id,
client_secret, redirect_uri and (if in the authorization
code) the state. Your service provider can also require you
to authenticate in any further way that they please. You will get back a
JSON response.
An example request might look like this:
    POST /o/oauth2/token HTTP/1.1
    Host: accounts.google.com
    Content-Type: application/x-www-form-urlencoded

    code=4/P7q7W91a-oMsCeLvIaQm6bTrgtp7&
    client_id=8819981768.apps.googleusercontent.com&
    client_secret={client_secret}&
    redirect_uri=https://oauth2-login-demo.appspot.com/code&
    grant_type=authorization_code
and the response if you're lucky will look something like:
    HTTP/1.1 200 OK
    Content-Type: application/json;charset=UTF-8
    Cache-Control: no-store
    Pragma: no-cache

    {
      "access_token":"1/fFAGRNJru1FTz70BzhT3Zg",
      "expires_in":3920,
      "token_type":"Bearer",
      "refresh_token":"1/xEoDL4iW3cxlI7yDbSRFYNG01kVKM2C-259HOF2aQbI"
    }
or if you're unlucky, maybe like this:
    HTTP/1.1 200 OK
    Content-Type: application/json;charset=UTF-8
    Cache-Control: no-store
    Pragma: no-cache

    {
      "error":"invalid_grant"
    }
Success is up to the service provider which can decide
not to give you tokens for any reason that they want, including that you
asked twice, they think the user might be compromised, they don't
like the client, or the phase of the Moon. (I am not aware of any
service provider that makes failure depend on the phase of the
Moon, but the others are not made up.)
The "request_tokens" method
of LWP::Authen::OAuth2 will make this request for you, read the JSON and
create the token or tokens. If you passed in a
"save_tokens" callback in constructing
your object, that will be called for you to store the tokens. On future
API calls you can retrieve that to skip the handshake if possible.
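For readers curious what "read the JSON and create the token or tokens" involves, the following sketch decodes the two example response bodies above with the core JSON::PP module. It is an illustration under stated assumptions, not the module's implementation: parse_token_response is a hypothetical helper, and sending the POST itself is left out.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: turn the JSON body of a Token Endpoint response
# into tokens, or die with the service provider's error.
sub parse_token_response {
    my ($json_text) = @_;
    my $data = decode_json($json_text);
    die "Token request failed: $data->{error}\n" if defined $data->{error};
    die "Token response had no access_token\n"
        unless defined $data->{access_token};
    return $data;
}

my $tokens = parse_token_response(<<'JSON');
{
  "access_token":"1/fFAGRNJru1FTz70BzhT3Zg",
  "expires_in":3920,
  "token_type":"Bearer",
  "refresh_token":"1/xEoDL4iW3cxlI7yDbSRFYNG01kVKM2C-259HOF2aQbI"
}
JSON
print "Got a $tokens->{token_type} token good for $tokens->{expires_in} seconds\n";

# The unlucky case raises an exception instead.
my $ok = eval { parse_token_response('{"error":"invalid_grant"}'); 1 };
print "Error: $@" unless $ok;
```

Checking the body for an "error" field rather than trusting the HTTP status alone guards against the unlucky example, which arrives with a 200 status.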
- token_type
- The token_type is a case insensitive description of the type of
token that you could be given. In theory there is a finite list of types
that you could encounter. In practice service providers can add
more at any time, either intentionally or unintentionally by failing to
correctly implement the one that they claimed to have created.
See LWP::Authen::OAuth2::AccessToken for advice on how to add
support for a new or incorrectly implemented token_type.
- expires_in
- The number of seconds until you will need a new token because the old one
should have expired. LWP::Authen::OAuth2 provides the
"should_refresh" method to let you know
when you need that new token. (It actually starts returning true slightly
early to avoid problems if clocks are not synchronized, or you begin a
series of operations.)
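The "slightly early" idea can be sketched as follows. The helper names and the 60-second margin are assumptions made for illustration; they are not taken from LWP::Authen::OAuth2, whose actual margin may differ.

```perl
use strict;
use warnings;

# Hypothetical sketch: decide whether a token needs refreshing, treating
# it as expired a little early. The 60-second margin is an arbitrary
# assumption, not the value that LWP::Authen::OAuth2 uses.
my $EARLY_MARGIN = 60;

# Convert the relative expires_in into an absolute epoch time.
sub expiry_time {
    my ($received_at, $expires_in) = @_;
    return $received_at + $expires_in;
}

# True once we are within the margin of the expiry time.
sub needs_refresh {
    my ($expires_at, $now) = @_;
    $now = time unless defined $now;
    return $now >= $expires_at - $EARLY_MARGIN ? 1 : 0;
}

my $received_at = time;
my $expires_at  = expiry_time($received_at, 3920);   # expires_in from above
print needs_refresh($expires_at) ? "refresh now\n" : "token still good\n";
```

Storing the absolute expiry time when the token arrives matters because expires_in is relative to the moment of the response, not to whenever the saved token is next loaded.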
- access_token
- An access_token is a temporary token that gives the consumer
access to the user's data in the service provider's system.
In the above response the "access_token"
is the value of the token, "expires_in"
is the number of seconds it is good for in theory (practice tends to be
close but not always exact), and
"token_type" specifies how it is
supposed to be used.
Once the authorization handshake is completed, if the
access_token has a supported token_type, then
LWP::Authen::OAuth2 will automatically sign any requests for you.
- Bearer token
- If the token_type is "bearer"
(case insensitive), then you should have a bearer token as
described by RFC 6750 <http://tools.ietf.org/html/rfc6750>. For as
long as the token is good, any request signed with it is authorized.
Signing is as simple as sending an https request with a header of:
Authorization: Bearer 1/fFAGRNJru1FTz70BzhT3Zg
You can also sign by passing
"access_token=..." as a post or get
parameter, though the specification recommends against using a get
parameter. If you are using LWP::Authen::OAuth2, then it is signed with
the header.
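As a small illustration of header signing, the sketch below builds that Authorization header from a token hash. authorization_header is a hypothetical helper, not part of LWP::Authen::OAuth2, and it handles bearer tokens only.

```perl
use strict;
use warnings;

# Hypothetical helper: build the Authorization header for a token.
# Only bearer tokens are handled; token_type is compared case
# insensitively, as described above.
sub authorization_header {
    my ($token) = @_;
    my $type = lc($token->{token_type} // '');
    die "Unsupported token_type '$type'\n" unless $type eq 'bearer';
    return (Authorization => "Bearer $token->{access_token}");
}

my %headers = authorization_header({
    token_type   => 'Bearer',
    access_token => '1/fFAGRNJru1FTz70BzhT3Zg',
});
print "Authorization: $headers{Authorization}\n";
```

With the core HTTP::Tiny client, a hash like this could be passed as the "headers" option of a request. That pairing is one possible choice, not something this document prescribes.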
- refresh_token
- The above example also included a refresh_token. If you were given
one, you can use it later to ask for a refreshed access_token.
Whether you get one is up to your service provider, who is likely
to decide that based on your client type.
- Refresh Access Token
- If you have a refresh_token, you can at any time send a Refresh
Access Token request. This is a POST to the Token Endpoint with
the refresh_token, client_id and client_secret
arguments. You also have to send a grant_type of
"refresh_token".
Thus in the above case we'd send
    POST /o/oauth2/token HTTP/1.1
    Host: accounts.google.com
    Content-Type: application/x-www-form-urlencoded

    refresh_token=1/xEoDL4iW3cxlI7yDbSRFYNG01kVKM2C-259HOF2aQbI&
    client_id=8819981768.apps.googleusercontent.com&
    client_secret={client_secret}&
    grant_type=refresh_token
and if lucky could get a response like
    HTTP/1.1 200 OK
    Content-Type: application/json;charset=UTF-8
    Cache-Control: no-store
    Pragma: no-cache

    {
      "access_token":"ya29.AHES6ZSiArSow0zeKokajrri5gMBpGc6Sq",
      "expires_in":3600,
      "token_type":"Bearer"
    }
and if unlucky could get an error as before.
In LWP::Authen::OAuth2 this request is made for you
transparently behind the scenes if possible. If you're curious when,
look in the source for the
"refresh_access_token" method. There
are also optional callbacks that you can pass to let you save the
tokens, or hijack the refresh method for your own purposes. (Such as
making sure that only one process tries to refresh tokens even though
many are accessing it.)
But note that not all flows offer a
refresh_token. If you're on one of those flows then you
need to send the user back to the service provider for
authorization renewal. From the user's point of view this is
likely to be painless because it will be done with transparent
redirects. But the consumer needs to be aware of it.
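The refresh exchange can be sketched with core Perl as well. The example refresh response carries no refresh_token, so the sketch assumes the old one stays valid and keeps it unless a replacement arrives. Both helpers are hypothetical and are not the module's "refresh_access_token" implementation.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical helper: the form fields for a Refresh Access Token request.
sub refresh_request_fields {
    my (%arg) = @_;
    return {
        grant_type    => 'refresh_token',
        refresh_token => $arg{refresh_token},
        client_id     => $arg{client_id},
        client_secret => $arg{client_secret},
    };
}

# Hypothetical helper: combine the old tokens with a refresh response.
# Assumption: when the response has no refresh_token, the old one is
# still valid, so it is kept unless a replacement is sent.
sub merge_refreshed_tokens {
    my ($old, $json_text) = @_;
    my $new = decode_json($json_text);
    die "Refresh failed: $new->{error}\n" if defined $new->{error};
    return { %$old, %$new };
}

my $old = {
    access_token  => '1/fFAGRNJru1FTz70BzhT3Zg',
    token_type    => 'Bearer',
    expires_in    => 3920,
    refresh_token => '1/xEoDL4iW3cxlI7yDbSRFYNG01kVKM2C-259HOF2aQbI',
};
my $merged = merge_refreshed_tokens($old,
    '{"access_token":"ya29.AHES6ZSiArSow0zeKokajrri5gMBpGc6Sq","expires_in":3600,"token_type":"Bearer"}');
print "New access token: $merged->{access_token}\n";
print "Refresh token kept: $merged->{refresh_token}\n";
```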
Ben Tilly, "<btilly at gmail.com>"
Thanks to Rent.com <http://www.rent.com> for their generous support in
letting me develop and release this module. My thanks also to Keith Cascio
"<cascio@helminthist.net>" for very
helpful feedback on early drafts.