
urlshortener's Issues

Overview cronjob

Overview

This article shows how to manually create a custom cron job using your Shell user. It involves logging into the server via SSH to run several commands.

These instructions can also be used to edit an existing cron job you created in the panel. However, for simplicity, if you created the cron job in the panel, it's recommended that you continue to edit it from the panel.

Basic details

The crontab files are where the lists of jobs and other instructions to the cron daemon are kept. Each user at DreamHost has their own individual crontab file, which can be accessed by running the following command under your Shell user:

[server]$ crontab -e

Crontab files are simple text files with a particular format. Each line is a series of fields separated by spaces and/or tabs, and each field can hold a single value or a series of values. A single cron job should take up exactly one line, but this can be a long line (more than 80 characters).

Things to look out for when editing/creating your crontab

Each line has five time/date fields, followed by a command, followed by a newline character ('\n'). A common problem is not including a newline, so hit 'Enter/Return' a time or three at the end of your command.

Another common problem is automatic word-wrap breaking up a long line into multiple lines, so make sure your text editor doesn't do this.

Blank lines and leading spaces and tabs are ignored. Lines whose first non-space character is a hash-sign (#) are ignored as they are considered comments. Note that comments are not allowed on the same line as cron commands, since they are interpreted as being part of the command. Similarly, comments are not allowed on the same line as environment variable settings (like MAILTO).
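
The rules above can be turned into a small pre-flight check. The following Python sketch is only an illustration: the function name check_crontab_text is hypothetical, and it assumes numeric time/date fields (it does not understand names such as "mon" or "jan"). It flags a missing final newline and lines that do not consist of five time/date fields followed by a command.

```python
import re

# Assumption: time/date fields contain only digits and the operators * , - /
FIELD = re.compile(r"^[0-9*,/-]+$")
# Environment settings such as MAILTO="..." are not job lines.
ENV_LINE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*\s*=")


def check_crontab_text(text):
    """Return a list of (line_number, problem) tuples for crontab text."""
    problems = []
    lines = text.splitlines()
    if text and not text.endswith("\n"):
        problems.append((len(lines), "last line has no trailing newline"))
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Blank lines and comment lines are ignored by cron.
        if not stripped or stripped.startswith("#"):
            continue
        if ENV_LINE.match(stripped):
            continue
        tokens = stripped.split()
        # Special strings such as @daily replace the five fields.
        if stripped.startswith("@"):
            if len(tokens) < 2:
                problems.append((number, "special string without a command"))
            continue
        fields, command = tokens[:5], tokens[5:]
        if len(fields) < 5 or not command or not all(FIELD.match(f) for f in fields):
            problems.append(
                (number, "expected five time/date fields followed by a command")
            )
    return problems
```

An empty result means none of these particular problems were found; it does not guarantee that cron will accept the file.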

What if I already created a cron job in the panel under my Shell user?

There are two ways to create custom cron jobs.

  • Editing the existing crontab on the server
  • Using a custom crontab file

If you've edited the existing crontab

If you have already created a cron job in your panel, you can view it by running crontab -e under your Shell user. If you edit the file to add another cron job below the existing panel one, the panel cron job will continue to function normally in addition to your new edited code.

Any adjustments in the panel will not affect your custom code.

If you switched the server's crontab with your custom crontab

You can also use a custom crontab that you created. If you do this, the server's crontab is overwritten. You can replace the server's crontab by running the following:

[server]$ crontab /home/username/mycrontab

However, if you then add or edit a cron job in the panel, your custom crontab will be overwritten. So you need to either use the server's crontab, or your custom crontab.

Manually creating a custom cron job

The instructions below explain how to add a custom cron job under your Shell user. These instructions assume you have NOT added a cron job in the panel yet, so the crontab file is blank.

  1. Log into your server via SSH using the Shell user you wish to create the cron job under.
  2. Once logged in, run the following command to open your crontab file.
[server]$ crontab -e
no crontab for example_username - using an empty one

Select an editor.  To change later, run 'select-editor'.
  1. /bin/ed
  2. /bin/elvis-tiny
  3. /bin/nano        <---- easiest
  4. /usr/bin/emacs24
  5. /usr/bin/jed
  6. /usr/bin/jmacs
  7. /usr/bin/joe
  8. /usr/bin/jpico
  9. /usr/bin/jstar
  10. /usr/bin/mcedit
  11. /usr/bin/rjoe
  12. /usr/bin/vim.basic
  13. /usr/bin/vim.tiny

Choose 1-13 [3]: 3
  3. You are then asked to choose an editor to view this file. #3 uses the program 'nano' which is the easiest option. View the 'Creating and editing a file via SSH' article for instructions on how to use nano.
  4. You are presented with this new crontab file:
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h  dom mon dow   command
  5. At the bottom, add the code for your cron job. This example runs a file named mail.php under the username of 'example_username'. This should be the same username you're currently logged in under. This example runs the cron job at 8:13 pm.
# Custom cron job
MAILTO="user@example.com"
13 20 * * * php  /home/example_username/mail.php

There are two parts to the command. The first part must specify the path to the program you'd like to use to run the cron job. For example let's say you have a PHP file named script.php in your domains directory:

  • /home/username/example.com/script.php

To run this command you'd enter the path to your chosen version of PHP followed by a space, followed by the path to the file:

  • /usr/local/php71/bin/php /home/username/example.com/script.php

You could also use the default version by using 'php' instead of the full path.

  6. Save the file. You should see the following response:
crontab: installing new crontab

That's it. The cron job should now run every day at 8:13pm.

Crontab commands

Please note that if you choose to replace the server's crontab, all cron jobs created in the panel for this specific username will no longer function since they would have been overwritten on the server.

Additionally, if you update any cron jobs under this user in the panel, it will overwrite your custom crontab. The crontab will be replaced in its original form as created in the panel.

Replace your existing crontab with your custom crontab file

[server]$ crontab /home/username/filename

Edit your server's crontab

[server]$ crontab -e

View your crontab

[server]$ crontab -l

Remove your crontab

[server]$ crontab -r

Explanation of the Date/Time fields

The first five fields of the line are the date and time fields, which specify how frequently and when to execute a command. When adding the cron job in the DreamHost panel, the Date/Time is added for you automatically based on your 'When to run' setting.

Field no.   Description        Permitted values
1           minute             0-59
2           hour               0-23
3           day of the month   1-31
4           month              1-12
5           day of the week    0-7

For day of the week, both 0 and 7 are considered Sunday. The time is based on that of the server running cron.

Another (graphical) way of looking at these fields.

 # * * * * *  command to execute
 # │ │ │ │ │
 # │ │ │ │ │
 # │ │ │ │ └───── day of week (0 - 6) (0 to 6 are Sunday to Saturday, or use names; 7 is Sunday, the same as 0)
 # │ │ │ └────────── month (1 - 12)
 # │ │ └─────────────── day of month (1 - 31)
 # │ └──────────────────── hour (0 - 23)
 # └───────────────────────── min (0 - 59)

There are several ways of specifying multiple values in these fields:

  • The comma (',') operator specifies a list of values.
    • 1,3,4,7,8
  • The dash ('-') operator specifies a range of values.
    • 1-6
    • This is equivalent to "1,2,3,4,5,6".
  • The asterisk ('*') operator (frequently known as a wildcard) specifies all possible values for a field. For example, an asterisk in the hour (second) field would be equivalent to 'every hour'.
  • The slash ('/') operator can be used in conjunction with an asterisk to skip a given number of values. Example:
    • */3
    • This means to skip to every third value. So "*/3" in the hour field is equivalent to "0,3,6,9,12,15,18,21"; "*" specifies 'every hour' but the "/3" means that only the first, fourth, seventh, etc. values given by "*" are used.
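
To make the operators concrete, here is a minimal Python sketch that expands one field expression into the list of values it selects. The function name expand_field and its low/high parameters are illustrative assumptions, not part of cron itself, and the sketch only handles numeric values (no month or weekday names).

```python
def expand_field(expr, low, high):
    """Expand a cron field such as '*/3', '1-6' or '1,3,4' into sorted values.

    low and high are the permitted bounds of the field, e.g. 0 and 23 for hours.
    """
    values = set()
    for part in expr.split(","):          # comma: list of values
        step = 1
        if "/" in part:                   # slash: keep every step-th value
            part, step_text = part.split("/", 1)
            step = int(step_text)
        if part == "*":                   # asterisk: all permitted values
            start, end = low, high
        elif "-" in part:                 # dash: inclusive range
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:                             # single value
            start = end = int(part)
        if step < 1 or not (low <= start <= end <= high):
            raise ValueError(f"invalid field expression: {expr!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)
```

For example, expand_field("*/3", 0, 23) returns [0, 3, 6, 9, 12, 15, 18, 21], matching the hour-field example above.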

You can also use one of these special strings in place of the time/date fields.

Entry                    Description                                                  Equivalent to   Example
@yearly (or @annually)   Run once a year at midnight on January 1                     0 0 1 1 *       @yearly php /home/example_username/mail.php
@monthly                 Run once a month at midnight on the first day of the month   0 0 1 * *       @monthly php /home/example_username/mail.php
@weekly                  Run once a week at midnight on Sunday morning                0 0 * * 0       @weekly php /home/example_username/mail.php
@daily (or @midnight)    Run once a day at midnight                                   0 0 * * *       @daily php /home/example_username/mail.php
@hourly                  Run once an hour at the beginning of the hour                0 * * * *       @hourly php /home/example_username/mail.php
@reboot                  Run at startup (of the cron daemon)                          @reboot         @reboot php /home/example_username/mail.php
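
The table can also be read as a simple lookup. The Python sketch below rewrites a line that starts with one of the special strings into its five-field equivalent; the names NICKNAMES and expand_nickname are hypothetical, and @reboot is left unchanged because it has no time/date equivalent.

```python
# Five-field equivalents of the special strings, taken from the table above.
NICKNAMES = {
    "@yearly": "0 0 1 1 *",
    "@annually": "0 0 1 1 *",
    "@monthly": "0 0 1 * *",
    "@weekly": "0 0 * * 0",
    "@daily": "0 0 * * *",
    "@midnight": "0 0 * * *",
    "@hourly": "0 * * * *",
}


def expand_nickname(line):
    """Replace a leading special string with its five time/date fields."""
    parts = line.split(None, 1)
    if len(parts) == 2 and parts[0] in NICKNAMES:
        return NICKNAMES[parts[0]] + " " + parts[1]
    # @reboot and ordinary five-field lines are returned unchanged.
    return line
```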

Review the Wikipedia article on cron for further information.

Output

The output of the cron job is determined by what is sent to the terminal as a result of the commands/script that are executed. By default, all output is emailed to the location specified in the MAILTO variable (see the MAILTO variable requirement section for more information). As noted above, if your cron job command doesn't create any output on the command line then no email is sent.

You can provide special instructions for the standard out (STDOUT) and standard error (STDERR) output by using the ">" operator. When you use ">" without a number before it, it defaults to "1>". This is the standard (non-error) output.

When you use "2>" you are specifying what to do with the error output. So, for example, ">my_file.txt" would redirect standard output to a file called "my_file.txt", and "2>my_errors.txt" would redirect the errors to a file called "my_errors.txt".
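
The same separation of the two output streams can be observed outside of cron. The following Python sketch is an illustration under stated assumptions: it starts a child Python process that writes one line to each stream and sends them to separate files, which mirrors ">my_file.txt 2>my_errors.txt". The function name run_with_redirects and the child command are made up for the example.

```python
import subprocess
import sys
from pathlib import Path

# A stand-in for a cron command: one line on STDOUT, one line on STDERR.
CHILD = "import sys; print('regular output'); print('error output', file=sys.stderr)"


def run_with_redirects(directory):
    """Run the child with STDOUT and STDERR redirected to two files."""
    out_path = Path(directory) / "my_file.txt"
    err_path = Path(directory) / "my_errors.txt"
    with open(out_path, "w") as out, open(err_path, "w") as err:
        # stdout=out corresponds to ">my_file.txt" (same as "1>my_file.txt"),
        # stderr=err corresponds to "2>my_errors.txt".
        subprocess.run(
            [sys.executable, "-c", CHILD], stdout=out, stderr=err, check=True
        )
    return out_path.read_text(), err_path.read_text()
```

Because nothing is left to be written to the terminal, a cron job redirected this way would produce no output to email.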

Permissions

By default, files created on DreamHost's servers have a permissions level of 644. If you choose to execute a script via cron, you may need to set the permissions for the file to 744 using chmod in order to allow it to execute properly.
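
As a rough illustration of what the permission change does, the Python sketch below applies mode 744 to a file and reads the mode back. It assumes a POSIX file system; the function name make_owner_executable is hypothetical, and on the server itself you would normally just run chmod.

```python
import os
import stat


def make_owner_executable(path):
    """Set permissions to 744 (rwxr--r--) and return the resulting mode bits."""
    os.chmod(path, 0o744)
    return stat.S_IMODE(os.stat(path).st_mode)
```

The value 744 adds the execute bit for the owner only; group and others keep read-only access, as with 644.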

Examples of custom cron scripts

The following examples show what you could add to a new file in order to create a cron job.

Example 1: This runs a command at 4:10 PM PDT/PST, and emails you the regular and error output to the destination specified by MAILTO.

10 16 * * * perl /home/username/bin/yourscript.pl

Example 2: This runs a command at 2:00 AM PDT/PST on Saturday, and the only output is errors.

0 2 * * 6 sh /home/username/weekly/weekly-pruning.sh > /dev/null

Example 3: This runs at midnight on New Year's Day (January 1st), and there is no output.

0 0 1 1 * python /home/username/happy.new.years.py >/dev/null 2>&1

2>&1 is a special redirect that sends the standard error (“2>”) output to the same place as the standard out (“>” or “1>”) output.

Example 4: This runs a PHP script called cron.php at the top of every hour.

0 * * * * php /home/username/cron.php

Example 5: This runs a local script (i.e. hosted at DreamHost) every 15 minutes.

*/15 * * * * /usr/local/php71/bin/php /home/example_username/myscript.php

Example 6: This runs an external script (i.e. hosted elsewhere) every 30 minutes using curl.

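*/30 * * * * /usr/bin/curl -s http://example.com/send.php &> /dev/null

&> /dev/null is a bash abbreviation for > /dev/null 2>&1. It redirects both file descriptor 1 (STDOUT) and file descriptor 2 (STDERR) to /dev/null. Because &> is specific to bash, the longer form > /dev/null 2>&1 is the safer choice if your cron jobs run under a different shell.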

View http://unix.stackexchange.com/a/70971 for more information.

Example 7: This runs a local script (i.e. hosted at DreamHost) every 10 minutes.

*/10 * * * * /usr/local/php71/bin/php /home/example_username/myscript.php

Example 8: This uses wget every 10 minutes to download a file to a directory named cronfolder.

*/10 * * * * /usr/bin/wget -P /home/username/cronfolder/ https://example.com/index.html

Dedicated server editing

If you are logged in as a Dedicated server admin user, you can edit the crontab file directly. It is stored here:

/var/spool/cron/crontabs/youruser

You’ll need to use sudo on your Dedicated server (or start an interactive session as the root user with sudo -i) to access that file.

If you require sudo/admin access, you must upgrade to a Dedicated server.

Example (opening the file with the 'vi' text editor):

[server]$ sudo vi /var/spool/cron/crontabs/youruser

RESTful API

1. Introduction

When following the guidelines of this document, the resulting API (Application Programming Interface) will reach Level 2 of the Richardson Maturity Model ([1], [4]). That means a resource model has been provided, proper HTTP(S) methods are used, appropriate HTTP headers have been identified, and HTTP status codes are used in responses.

The top level of the Richardson Maturity Model, Level 3, will not be reached. This third level assumes the use of hypermedia controls: such controls allow REST (Representational State Transfer) servers to inform REST clients about the APIs that may be invoked in the current state of the application. While this is promising in terms of, for example, maintainability (e.g. APIs may be changed without clients having to understand these changes), no best practices have been established yet to deal with this.

2. Overall Approach

The following major steps should be followed to create a RESTful (Level 2) API (see Figure 1).
[Figure 1: overall approach]

First, a model of the data to be manipulated by the API is created. From this data model the resources of the API will be determined; typically, there is no one-to-one correspondence between data model elements and resources of the API, especially because new kinds of resources will typically be derived. For each of the resources the representations supported by the APIs have to be determined. Next, these resources must be named properly by means of URIs (Uniform Resource Identifiers). For each of the resources the HTTP methods used to perform the required application functions have to be decided; this includes the use of applicable HTTP headers. Special behavior required by the application (e.g. concurrency control, long running requests) has to be decided. Finally, potential error situations have to be identified and corresponding error messages have to be designed.

Note: Although Figure 1 sketches the approach as a sequential process, following it step-by-step is not always needed. For example:

  • The data model may already be known. In this case, the first step will be omitted.
  • The resource model has already been decided. In this case, the second step may be left out. But in case the data model is not precisely specified, it may be worth performing step 1.
  • Your API is very straightforward, e.g. you don't expect concurrent updates of your resources, or none of your APIs kick off long-running actions. Then you will leave out the step to "determine special behavior".

Each individual step of the overall approach will be detailed in the next sections.

3. Data Model

The data model behind an API can be specified by any conceptual data modeling language like the Entity-Relationship Model or UML Class Diagrams. In what follows we assume the use of the Entity-Relationship Model.

The main purpose of the data model behind an API is to specify the properties of the resources manipulated by an API in an abstract (i.e. implementation independent) manner. By specifying the attributes of the entity types of the data model no early decision is made about the format and media type in which instances of the entity types (aka representations in REST) are exchanged - this decision will be made later, and it can be changed during the lifetime of an API. Thus, it results in more flexibility in the development process.
[Figure 2: sample data model]

4. Resource Model

The resource model specifies the resources that are processed by the API. Several kinds of resources will be derived from both the data model as well as the corresponding processing requirements.

4.1 Atomic Resources

The most basic decision to be made for deriving resources is identifying entities of the data model that are exchanged as a whole via the API. Such entities become atomic resources.

For example, based on the sample data model in Figure 2 the Customer entity will become such an atomic resource. This is because details about a customer like their address, payment information etc. will be accessed in several scenarios supported by the API.

4.2 Collection Resources

The next decision to be made is whether atomic resources of the same type need to be grouped into a set. Such bundles become collection resources.

For example, products will become a collection resource because the application supports a catalogue that allows browsing through (subsets of) all products available. Note, that there is no products entity type in the data model. Because the application requires such a collection resource we derive it from the data model and give it a new name that corresponds to the plural of the name of the grouped entity type.

As another example, items will become a collection resource that represents all items contained in a shopping cart of a certain customer. This collection will be scoped, i.e. only the items in a specific shopping cart are of interest but not the set of all items in all shopping carts (see section 5.6 for more details on scoping).

Finally, whenever the API supports the creation of an instance of one of the entity types of the data model, this entity type results in a corresponding collection resource. For example, a new customer may register with the application resulting in a new instance of the Customer entity type. Thus, Customers will become a collection resource.

4.3 Composite Resources

Sometimes, instances of groups of different entity types are manipulated as a whole because these instances are perceived as aggregates, e.g. they are typically collectively retrieved or deleted. Such groups become composite resources.

For example, a Shopping Cart is a composite resource because it is often retrieved or deleted as a whole, i.e. with all of its encompassed items.

4.4 Controller Resources

Controller resources are used when multiple resources have to be manipulated in a single API call in order to maintain data consistency. If integrity rules between resources must be obeyed, a client would have to understand these rules, such as the order in which resources are to be manipulated. Providing a controller resource to manipulate these resources in a single API call relieves the client from having to understand these rules - a significant contribution to loose coupling.

For example, deleting each individual item of a shopping cart one after the other may result in consistency problems in case an error occurs after having deleted only the first few items while others are still left in the shopping cart: a customer requesting the shopping cart just at this point in time of failure will see a "broken" shopping cart.

Another example is the update of two account resources to realize a funds transfer - the classical motivation for ACID (Atomicity, Consistency, Isolation, Durability) transactions. Each of these two accounts is an atomic resource, i.e. controller resources are different from composite resources.

4.5 Processing Function Resources

Processing function resources (aka computing resources) provide access to functions that either process particular resources, or that perform certain resource independent computations. In practice, processing function resources are often used for predefined partial updates of a resource.

For example, changing the status of a resource like the price of a product, or getting the official exchange rate between two currencies can be realized by means of a processing function resource.

Note: Partial updates are addressed by the HTTP PATCH method [7]. The problem with PATCH is two-fold:

  • The PATCH method is not (yet) supported by all web servers. Of course, this problem may go away.
  • The syntax and semantics for each use of the PATCH method must be crisply defined: the resource enclosed in the body of a PATCH request is an instruction document, i.e. a set of instructions precisely describing what has to be updated and how, and all of these instructions must be performed atomically. In particular, the media type of this instruction document is typically different from the media type of the resource that is to be modified by the PATCH. The instruction document may be perceived as a sort of transaction on the resource to be manipulated.

This results in broad exploitation of processing function resources for realizing partial updates.

4.6 Interpreting Relationships

One of the basic problems of deriving a resource model from a data model is in interpreting the relationships between entity types of the data model.

If manipulating instances of a certain entity type does not require the traversal of its associated relationships, then such an entity type is a candidate for an atomic resource: if the instances have to be available via the API, the entity type is transformed into an atomic resource in a one-to-one correspondence.

Collection resources may be subject to an interpretation of their associated relationship types, i.e. collections may only make sense as children of other resources (so-called scoped collections - see 5.6).

For example: The Products collection resource is not scoped, i.e. this collection is a first-class resource of the sample resource model. In contrast, the Items collection resource is scoped: the collection of all items in all shopping carts is typically not of interest at all, but all items within a certain shopping cart are. Thus, the collection of Items that depends on a certain shopping cart is a collection resource (see section 5.6 for how to denote such scoped collections).

If the API has to support the direct creation of instances of an entity type, this entity type results in a collection resource. This is because in the REST paradigm a collection resource is a factory for its members (see section 7.3). The entity type itself is the basis for atomic resources that are the members of the collection resource.

Processing function resources as well as controller resources result from functional requirements: they are typically not immediately derived from the data model but from update requirements or from requirements to derive information that may not even be related to some other resources.

5. Resource URIs

The complete URI of an API complies with the following structure:

{scheme}://{host}/{base-path}/{path}[?{query-string}]

5.1 Proper Naming

Proper naming of resources is key for an API to be easily understandable by clients. There are a few rules that should be followed ([2], [3], [5]):

  • Atomic resources, collection resources and composite resources should be named as nouns because they represent "things", not "actions" (actions would lean more towards verbs as names of resources).
  • Processing function resources and controller resources should be named as verbs because they in fact represent "actions".
  • Processing function resources and controller resources should not be sub-resources of individual other resources.
    • They should not be named by means of a URI template (see 5.6).
    • Individual resources become parameters.
  • Only lower case characters should be used in names because the rules about which URI element names are case sensitive and which are not may cause confusion.
  • If multiple words are used to name a resource, these words should be separated by dashes (i.e. "-").
    • In particular, no underscore (i.e. "_") should be used: when names are rendered in browsers, they will be interpreted as links, i.e. shown underlined, and, thus, the underscore will be difficult to read.
    • Similarly, camel case or other programming language related naming should be avoided.
  • Singular nouns should be used for naming atomic resources.
  • Names of collections should be "pluralized", i.e. named by the plural noun of the grouped concept (atomic resource or composite resource).
  • Use forward slashes (i.e. "/") to specify hierarchical relations between resources. A forward slash will separate the names of the hierarchically structured resources. The parent name will immediately precede the name of its immediate children.
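
Some of these rules are mechanical enough to check automatically. The following Python sketch is one possible reading of the lower-case and dash rules only; the function name naming_problems is hypothetical, and the noun/verb and singular/plural rules are out of scope because they need human judgment.

```python
import re

# Assumption: a well-formed name is lower-case words joined by single dashes.
SEGMENT = re.compile(r"[a-z0-9]+(-[a-z0-9]+)*")


def naming_problems(path):
    """Return a list of naming problems found in the segments of a path."""
    problems = []
    for segment in path.strip("/").split("/"):
        # Template variables such as {product-id} follow the same rules.
        if segment.startswith("{") and segment.endswith("}"):
            segment = segment[1:-1]
        if "_" in segment:
            problems.append(f"{segment}: use dashes instead of underscores")
        elif segment != segment.lower():
            problems.append(f"{segment}: use lower case only")
        elif not SEGMENT.fullmatch(segment):
            problems.append(f"{segment}: unexpected characters")
    return problems
```

For instance, "/shopping-carts/{shopping-cart-id}" yields no problems, while "/shoppingCarts" and "/shopping_carts" are each reported.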

5.2 Schemes

A scheme denotes the transport protocol supported by the API. Typically, WSO2 APIs will be all accessible over HTTPS, some APIs may support HTTP, and some may support both schemes.

5.3 Host

The host part of the API specifies the domain of the API. For WSO2 hosted APIs, this value is apis.wso2.com. When hosted by or for customers this will be substituted by a customer-specific string.

5.4 Base Path

The base-path of an API follows the structure

/{feature-code}/[{sub-code}/]{version}

Thus, each base path consists of a feature-code structure indicating the feature this API is for. An optional sub-code structure may be used for features containing logically independent collections of functionalities. For example, the feature-code may be "apim" for API manager, in which case no sub-code is used, and the version may be v1.0 (see section 5.5 for details on versioning). For Enterprise Store, the feature-code may be "es", and the independent Publisher functionality may get the "publisher" sub-code assigned.

The base path will be the same for all APIs of certain features or logically independent collections of feature functionalities, respectively. This base path will precede each proper resource name of the API, i.e. the path element of the API's URI. Note, that a path is often encoded as a URI template (see section 5.6).
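
As an illustration of how the URI parts fit together, the following Python sketch assembles a base path and a complete URI. The function names build_base_path and build_uri are hypothetical, and the "products" path and the v1.0 version used for Enterprise Store in the usage note are assumptions chosen only for the example.

```python
def build_base_path(feature_code, version, sub_code=None):
    """Assemble /{feature-code}/[{sub-code}/]{version}."""
    parts = [feature_code]
    if sub_code:
        parts.append(sub_code)   # optional sub-code
    parts.append(version)
    return "/" + "/".join(parts)


def build_uri(scheme, host, base_path, path, query_string=None):
    """Assemble {scheme}://{host}/{base-path}/{path}[?{query-string}]."""
    # Surrounding slashes are stripped so that no "//" appears between parts.
    uri = f"{scheme}://{host}/{base_path.strip('/')}/{path.strip('/')}"
    if query_string:
        uri += "?" + query_string
    return uri
```

With these helpers, build_base_path("apim", "v1.0") gives "/apim/v1.0", build_base_path("es", "v1.0", "publisher") gives "/es/publisher/v1.0", and build_uri("https", "apis.wso2.com", "/apim/v1.0", "/products") gives "https://apis.wso2.com/apim/v1.0/products".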

5.5 Versioning

The version of an API is specified as part of its URI. This version is specified as a pair of integers (separated by a dot) referred to as the major and the minor number of the version, preceded by the lower case letter "v". E.g. a valid version string in the base path would be v2.1 indicating the first minor version of the second major version of the corresponding API.

Using this versioning scheme is referred to as semantic versioning [6]. In general, a version number following the semantic versioning concept has the structure major.minor.patch, and the significance in terms of client impact decreases from left to right. Starting with the least significant part:

  • An incremented patch number means that the underlying modification to the API cannot even be noticed by a client - thus, the patch number is omitted from the version string. Only the internal implementation of the API has been changed while the signature of the API is unchanged. From the perspective of the API developer, a new patch number indicates a bug fix, a minor internal modification, etc.
  • An incremented minor number indicates that new features have been added to the API, but this addition must be backward compatible: the client can use the old API without failing. For example, the API may add new optional parameters or a completely new request.
  • An incremented major number signals changes that are not backward compatible: for example, new mandatory parameters have been added, former parameters have been dropped, or complete former requests are no longer available.

It is best practice to support the current major version as well as at least one major version back. In case new versions are released frequently (e.g. every few months), more major versions back have to be supported. Otherwise, clients will break too quickly.
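
The version rules can be expressed in a few lines. The Python sketch below parses a version string of the form v{major}.{minor} and applies one possible reading of the compatibility rules: a client keeps working as long as the major number is unchanged and the minor number has not gone backwards. The function names parse_version and is_backward_compatible are hypothetical.

```python
import re

# "v" followed by the major and minor number, e.g. "v2.1".
VERSION_PATTERN = re.compile(r"v(\d+)\.(\d+)")


def parse_version(text):
    """Return (major, minor) for a version string such as 'v2.1'."""
    match = VERSION_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def is_backward_compatible(client_version, server_version):
    """True if a client written for client_version can use server_version."""
    client_major, client_minor = parse_version(client_version)
    server_major, server_minor = parse_version(server_version)
    # Same major number: no breaking changes; newer minor: only additions.
    return client_major == server_major and client_minor <= server_minor
```

Under this reading, a client written for v2.0 can use v2.1, while a client written for v1.4 cannot be assumed to work against v2.0.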

When a client is using an API's URI with a version number no longer supported, the server has to respond with the following response message that especially contains a Location header field with the URI of the latest version of the API:

HTTP/1.1 301 Moved Permanently
Location: {URI of the latest version of the API}

Note: There is a lot of debate on how to specify versions, and several alternative approaches have been proposed. The discussion spans the whole spectrum from what the pure REST style considers a resource to what the big players providing Web APIs offer. What is recommended here is a pragmatic guideline.

5.6 URI Templates

A URI template is an element which contains strings in curly brackets [8]. These strings are variables that must be substituted by values when such a URI template is used by a client.

For example, when using the URI template

/shopping-carts/{shopping-cart-id}/{item-id}/product

"shopping-cart-id" and "item-id" are variables. These variables must be substituted when an API using this URI template in its path is to be used.

A typical use of URI templates is in collection resources. An individual member of a collection will be identified by a unique value that will become the variable in the template immediately following the name of the corresponding collection resource. For example:

/products/{product-id}

Note the difference between this URI template and the one before. This URI template denotes a collection that is immediately derived from an entity type of the data model; all instances of this entity type should be accessible from the API, and they are grouped into the collection for that purpose.

The URI template before is different in the sense that it has been abstracted from a relation type of the data model, namely the relation between the Shopping Cart entity type and the Item entity type. Since there is no need to access the set of all instances of the Item entity type, no dedicated items collection type is part of the resource model. Instead, only items associated with a certain shopping cart (identified by its shopping-cart-id) should be accessible by the API, i.e. this set of items is scoped by the corresponding shopping cart: the collection of items is called a scoped collection.
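
Substituting the variables of such a template is straightforward for the simple {name} form used in this document. The Python sketch below is a minimal illustration and does not implement the richer operators of the URI Template specification [8]; the function name expand_template and the sample values are assumptions.

```python
import re
from urllib.parse import quote

# Matches one variable in curly brackets, e.g. {shopping-cart-id}.
VARIABLE = re.compile(r"\{([^{}]+)\}")


def expand_template(template, values):
    """Replace every {name} in the template with the value supplied for it."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for variable {name!r}")
        # Percent-encode the value so it stays inside one path segment.
        return quote(str(values[name]), safe="")
    return VARIABLE.sub(substitute, template)
```

For example, expanding "/shopping-carts/{shopping-cart-id}/{item-id}/product" with the values 42 and 7 yields "/shopping-carts/42/7/product".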

5.7 Query String

By definition, an optional query string (if specified) is part of the URI and contributes to the unique identification of a resource, i.e. the URI without the query string and the URI with the query string identify different resources (a fact that is often ignored).

However, it is best practice to not use fields of the query string as identifier components. In this sense, a query string provides parameters to control the execution of the API processing the resource identified by the structure preceding the "?" symbol:

{scheme}://{host}/{base-path}/{path}?{query-string}

The query string consists of a sequence of name/value pairs that are separated by an "&". The name-string and the value-string are separated by a "=".

In practice, URIs are limited in size. Even worse, the maximum size supported by products differs. As a consequence, parameters of large query strings have to be moved into the message body of the corresponding requests. Note that this will only work for requests that allow a message body (especially POST, but not GET).
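
The name/value structure of a query string can be produced and read back with Python's standard urllib.parse module. In the sketch below the helper names add_query and read_query, as well as the parameter names "limit" and "sort", are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def add_query(uri, params):
    """Append name/value pairs to a URI as a query string."""
    # urlencode joins the pairs with "&" and each name and value with "=".
    return uri + "?" + urlencode(params)


def read_query(uri):
    """Return the name/value pairs of the query string as a list of tuples."""
    return parse_qsl(urlsplit(uri).query)
```

For example, add_query("https://apis.wso2.com/apim/v1.0/products", [("limit", 10), ("sort", "name")]) produces a URI ending in "?limit=10&sort=name", and read_query returns the same pairs with the values as strings.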

6. Representation Specification

A format in which instances of the entity types of the data model are exchanged is referred to as a representation of such an instance. A representation is some sort of the shape of a resource, not the resource itself. In this sense, a representation is some sort of view onto the resource.

The information content of an atomic resource or a composite resource is immediately defined by the data model underlying an API. The values of the attributes and - if appropriate - the identifiers of associated resources make up the information content of an atomic resource. Similarly, the aggregate of the information contents of the resources of a composite resource is the information content of the composite resource.

A data structure must be chosen for this information content. In addition, one or more renderings of this data structure must be chosen. For example, a certain data structure may be rendered as a JSON (JavaScript Object Notation) document or an XML (eXtensible Markup Language) document. Typically, renderings of data structures are specified by means of MIME (Multipurpose Internet Mail Extensions) types. A specific rendering of a data structure is referred to as the representation of the resource, i.e. a representation has a MIME type. Keep in mind that this MIME type does not indicate the data structure of the information exchanged between the client and the API implementation but only the rendering.
Note: In most practical situations, a single representation (e.g. JSON) for all entities exchanged via an API suffices.
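
To make the distinction between a data structure and its renderings concrete, the following sketch renders one and the same data structure as JSON and as XML. The attribute names and the element name `product` are assumptions chosen for illustration.

```python
import json
import xml.etree.ElementTree as ET

# One data structure (hypothetical product attributes) ...
product = {"id": "27182", "price": "1200", "status": "on-stock"}


def render_json(data):
    """Return (MIME type, document) for the JSON rendering."""
    return "application/json", json.dumps(data, sort_keys=True)


def render_xml(data, root_name="product"):
    """Return (MIME type, document) for the XML rendering."""
    root = ET.Element(root_name)
    for name, value in data.items():
        ET.SubElement(root, name).text = str(value)
    return "application/xml", ET.tostring(root, encoding="unicode")


# ... and two representations of it, each identified by its MIME type.
json_type, json_doc = render_json(product)
xml_type, xml_doc = render_xml(product)
```

Both documents carry the same information content; the MIME type only tells the receiver which rendering it is looking at.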

7. HTTP Methods Used

Manipulation of resources in the REST style is done by create, retrieve, update, and delete operations (so-called CRUD operations), which map to the HTTP methods POST, GET, PUT, and DELETE.

A request that can be used without producing any side-effect is called a safe request. A request that can be used multiple times and always produces the same effect as the first invocation is called idempotent.

7.1 Get

GET is specified as a safe and idempotent request in HTTP as well as in the REST style. Thus, an API using the GET method must not produce any side-effects. Retrieving an atomic resource (4.1) or a composite resource (4.3) is done by performing a GET on the URI identifying the resource to be retrieved.

Retrieving a (subset of) resources of a certain type is done by performing a GET on the URI of the collection resource (4.2) of that type, and specifying a filter condition (10.2).

7.2 Put

PUT substitutes the resource identified by the URI. Thus, the body of the corresponding PUT message provides the modified but complete representation of a resource that will completely substitute the existing resource: parts of the resource that have not changed must be included in the modified resource of the message body. In particular, a PUT request must not be used for a partial update. As a consequence, PUT is an idempotent request (but not safe).

Partial updates, i.e. updates that modify only selected parts of an existing resource, have to be realized by means of corresponding processing function resources (see section 7.3).

Note: PUT may be used to create a new resource, but this is not recommended. The reason is that in this case, the client is in charge of creating a globally unique URI identifying the newly created resource, and ensuring the uniqueness of identifiers is a difficult task. In contrast, using POST to create new resources relieves the client from creating unique identifiers because the server will create the URI and return it to the client (see section 7.3).
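
The substitution semantics of PUT can be sketched with an in-memory store. This is a simplified illustration under stated assumptions: the dictionary `store`, the function `put` and the sample product are hypothetical and stand in for a real API implementation.

```python
# Hypothetical in-memory resource store keyed by URI.
store = {"/products/31415": {"price": 1000, "color": "red", "status": "on-stock"}}


def put(uri, representation):
    """Replace the whole resource at `uri`; nothing is merged."""
    if uri not in store:
        return 404
    store[uri] = dict(representation)
    return 200


# The client sends the complete representation, including unchanged parts.
complete = {"price": 900, "color": "red", "status": "on-stock"}
first = put("/products/31415", complete)
second = put("/products/31415", complete)  # repeating it changes nothing further
```

Because the stored resource is replaced as a whole, sending only `{"price": 900}` would drop `color` and `status`, which is why PUT is unsuitable for partial updates.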

7.3 Post

POST is neither safe nor idempotent. The main usages of POST are the creation of new resources and the initiation of functions, i.e. the interaction with processing function resources (4.5) as well as controller resources (4.4).

In order to create a new resource, a POST request is used with the URI of the collection resource to which the new resource should be added. If the POST is processed successfully, the response message will include a Location header that has the newly created URI of the added resource as its value. Also, it is good practice to return in the response message a Last-Modified header containing the time the resource was created, as well as an ETag (Entity Tag) header containing the entity tag of the new resource.

It is often appropriate that the client checks the correctness of the created resource. For this purpose, the response message body contains the resource as it would be returned by retrieving it from the URI of the Location header. In this case, the response message also includes a Content-Location header repeating the URI of the newly created resource.
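
The creation of a resource via POST, including the response headers mentioned above, might look as follows. This is a sketch only: the sequential identifier scheme, the SHA-256 based entity tag and the names `post`, `collection` and `entity_tag` are assumptions, not requirements of HTTP or of any specific API.

```python
import hashlib
import itertools
import json
from email.utils import formatdate

collection = {}              # hypothetical store: URI -> representation
_ids = itertools.count(1)    # assumption: the server assigns sequential ids


def entity_tag(representation):
    """Derive an entity tag as a digest of the representation (one possible choice)."""
    raw = json.dumps(representation, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(raw).hexdigest()[:16] + '"'


def post(collection_uri, representation):
    """Add a new resource to the collection and describe it in the response."""
    uri = f"{collection_uri}/{next(_ids)}"   # the server, not the client, creates the URI
    collection[uri] = dict(representation)
    headers = {
        "Location": uri,
        "Content-Location": uri,             # the body repeats the created resource
        "ETag": entity_tag(representation),
        "Last-Modified": formatdate(usegmt=True),
    }
    return 201, headers, collection[uri]


status, headers, body = post("/products", {"price": 1000, "color": "red"})
```

Returning the created resource in the body lets the client check it without issuing a separate GET on the Location URI.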

7.4 Delete

A resource is deleted by means of the DELETE request on the URI of the resource. Once a DELETE request has returned successfully with a "200 OK" response, subsequent DELETE requests on the same URI will result in a "404 Not Found" response because there is no resource available with the URI of the deleted resource.

Note: By definition, DELETE is an idempotent request, which has the following curious theoretical implication. Responding with "200 OK" to the first DELETE and with "404 Not Found" to any further DELETE on the same URI is not quite RESTful because the responses to the first and all further requests are different; thus, the request is not idempotent. In order to fully comply with the REST style, a server would have to maintain all URIs of deleted resources in order to always respond with "200 OK", i.e. with the same response. This is considered too much of an effort for nearly no gain, i.e. this is not implemented in practice.
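
The behavior described in this section can be reproduced with a few lines. The store `resources` and the function `delete` are hypothetical names used for this sketch.

```python
# Hypothetical in-memory resource store keyed by URI.
resources = {"/products/31415": {"price": 1000}}


def delete(uri):
    """Delete the resource at `uri` if it exists."""
    if uri in resources:
        del resources[uri]
        return 200
    return 404   # nothing is stored under this URI (anymore)


codes = [delete("/products/31415"), delete("/products/31415")]
```

In this sketch the stored state is the same after the first and after the second call (the resource is gone); only the status code of the response differs.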

8. Headers

HTTP headers provide the vehicle for many non-functional properties of REST APIs. The following HTTP headers are used in most APIs.

8.1 Request Headers

Accept
This is the list of content types acceptable for the client.

Authorization
The credentials of the client for authentication by the server.

Content-Type
This is the MIME-type of the message body of the PUT or POST request.

If-Match
Used to avoid concurrency conflicts: if the client-passed entity tag is identical to the entity tag of the resource at the server side, the request is performed.

If-Modified-Since
Used to avoid retrieving data that has been cached by the client. If the client-provided timestamp is identical to the time the entity was last modified at the server side, no message body is returned.

If-None-Match
Used to avoid retrieving data that has been cached by the client. If the client-provided entity tag is identical to the entity tag of the resource at the server side, no message body is returned.

If-Unmodified-Since
Used to avoid concurrency conflicts: if the client-passed last-modified timestamp is identical to the time the resource was last changed at the server side, the request is performed.

8.2 Response Headers

Content-Location
The URL of the message body. For example, the URL of the resource describing the status of a long running request (see 10.6).

Content-Type
The MIME-type of the message body.

ETag
A "fingerprint" of the resource as currently available at the server, often a digest of the resource.

Last-Modified
The time at which the resource was last modified at the server.

Location
The URL of a newly created resource.

WWW-Authenticate
An indication of the authentication scheme to be used to access the resource.

9. Status Codes

HTTP status codes [9] are returned by response messages and provide key information to clients about the status of a request. The following status codes are used in many APIs.

200 OK
The request has been performed successfully. If the request was a GET, the requested resource is returned in the message body. If the request was a POST, the result of the requested action is described by the message body, or it is contained in the message body.

201 Created
The request has been performed successfully. The URL of the newly created entity is contained in the Location header of the response. An ETag header should be returned with the current entity tag of the resource just created. The response may also contain an entity corresponding to the created resource.

202 Accepted
The processing of the request has started but will take some time (see 10.6). The success of the processing is not guaranteed and should be checked by the client. The body of the response message should provide information about the current state of the processing, as well as information about where the client can request updated status information at a later point in time; typically, the Content-Location header of the response contains a URL where this status information can be retrieved via GET.

303 See Other
The response of the request is available at a different URL; this URL is given as value of the Location header of the response message. Typically, this status code is returned after the processing of a long running request is completed and the client retrieves the status of the long running request (see 10.6).

304 Not Modified
The requesting client has already received the latest version of the requested resource. Thus, the body of the response message must be empty. This status code is returned as a result of a conditional GET (see 10.4) when the specified conditions (i.e. If-None-Match, If-Modified-Since) are not met.

400 Bad Request
The request is invalid. For example, syntax errors in expressions passed with the request are found, values are out of range, required data is missing etc.

401 Unauthorized
The request requires client authorization or the passed credentials are not accepted. The response must include a WWW-Authenticate header. The request may be repeated by the client including proper credentials in the Authorization header.

403 Forbidden
The server understood the request but refused to perform it. For example, the request must be conditional but no condition has been specified.

404 Not Found
The requested entity does not exist.

406 Not Acceptable
The requested media type is not supported. For example, a GET request wants to retrieve an entity in a media type (specified as value of the Accept header of the request) not supported by the server.

412 Precondition Failed
The request has not been performed because one of the preconditions has not been met. This status code is returned when the request was conditional (see 10.5) and one of the conditions specified (i.e. If-Match, If-Unmodified-Since) is not met.

415 Unsupported Media Type
The entity passed by the request was in a format that is not supported. For example, a PUT request passed an entity in its body, and this entity was in a format or media type, respectively, that is not understood by the server.

Note: 5xx status codes denote severe errors at the server side or the network, or denote not implemented functions, etc. Such errors are very generic, i.e. there is no need to document them explicitly for an API.
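
The status codes listed in this section are available as constants in Python's standard `http` module, which can help to avoid misspelled reason phrases. The list `DOCUMENTED` and the helper names below are assumptions made for this sketch.

```python
from http import HTTPStatus

# The status codes discussed in this section.
DOCUMENTED = [200, 201, 202, 303, 304, 400, 401, 403, 404, 406, 412, 415]


def status_line(code):
    """Build the status line of a response from a numeric status code."""
    status = HTTPStatus(code)
    return f"HTTP/1.1 {status.value} {status.phrase}"


def is_generic_server_error(code):
    """5xx codes are generic and need no API-specific documentation."""
    return 500 <= code <= 599


lines = [status_line(code) for code in DOCUMENTED]
```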

10. Special Behavior

Except for very simple APIs, a REST API has to offer features that allow it to cope with advanced situations like large result sets, concurrent updates, or long running requests. The following describes best practices to deal with some of those special situations.

Note: It is good practice for an API to support at least queries (see 10.2) and pagination (see 10.3). For example, this makes it possible to push down filtering etc. from an API orchestration (an increasingly important API technology [10]) to the individual APIs in order to optimize response time, bandwidth usage, etc.

10.1 Content Negotiation

The REST style clearly distinguishes between a resource itself (i.e. as an abstract entity) and its different possible representations. Such a representation is a rendering of the resource's information content in the format of a certain media type. Which of the representations of a resource is returned as content of the message body is negotiated between a client and the resource provider (a.k.a. content negotiation).

Server-driven content negotiation assumes that a server processing a request determines the representation best suited to the requesting client. The server depends on the requesting client to specify information about its processing capabilities or requirements. This is done by means of header fields of the request message like Accept, Accept-Encoding etc.

In client-driven content negotiation the server detects that it has more than one representation of a resource available that may serve the client's needs. The server responds to the request with a "300 Multiple Choices" message that carries information about the representations available, and the client finally selects one of these representations and explicitly requests it in a following request.

Server-driven content negotiation has the advantage of avoiding a second request by the client, but has the disadvantage of potentially returning a representation that is not ideally suited to the client. Client-driven content negotiation has the advantage that the client gets the representation that is best suited for it, but the disadvantage of two round-trips.

In the following request, the client specifies that it is able to process JSON, XML as well as plain text, but that it prefers JSON. The preferences for media types are specified by weighting each media type with a quality value q. The server determines which of the requested representations it has available and returns the one with the highest quality value.

Example: GET /products/27182
Accept: application/json;q=0.9, application/xml;q=0.6, text/plain;q=0.1

If the server does not have any of the representations specified by the client, it responds with the corresponding status code:

HTTP/1.1 406 Not Acceptable

Note: Even if an API supports only a single media type, a client may pass a request that contains an Accept header with a different media type. In this case, the API implementation must return the "406 Not Acceptable" message.
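
Server-driven content negotiation as described above can be sketched as follows. This is a deliberately simplified illustration: it ignores wildcards such as `*/*` and media type parameters other than `q`, and the function names `parse_accept` and `negotiate` are assumptions.

```python
def parse_accept(header):
    """Return (media type, q) pairs from an Accept header, highest q first."""
    preferences = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        q = 1.0  # a media type without an explicit q counts as q=1
        for parameter in parts[1:]:
            name, _, value = parameter.partition("=")
            if name.strip() == "q":
                q = float(value)
        preferences.append((parts[0], q))
    # sorted() is stable, so equal q values keep the order of the header.
    return sorted(preferences, key=lambda pref: pref[1], reverse=True)


def negotiate(accept_header, available):
    """Pick the available media type the client weighted highest, or answer 406."""
    for media_type, q in parse_accept(accept_header):
        if q > 0 and media_type in available:
            return 200, media_type
    return 406, None


accept = "application/json;q=0.9, application/xml;q=0.6, text/plain;q=0.1"
with_json = negotiate(accept, {"application/json", "application/xml"})
without_json = negotiate(accept, {"application/xml", "text/plain"})
unsupported = negotiate("text/csv", {"application/json"})
```

A server that only offers XML and plain text falls back to XML for this client, because XML carries the higher quality value of the two.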

10.2 Queries

A query on a collection resource consists of three artifacts: (i) a mandatory filter condition, (ii) an optional sort expression, and (iii) an optional projection. First, the attributes of the entity type on which the collection is based that can be used in any of these artifacts have to be determined.

A filter condition is a boolean expression in these attributes. Filter conditions range from simple (a single attribute name with a value that is compared for equality) to complex (arbitrary conditions and arbitrary comparison operators).

Example of a complex query: filter=((price > 1000 AND status = on-stock) OR (price < 200 AND NOT(status = on-stock)))

A sort expression consists of a list of attribute names together with the indication for each attribute name whether the result is to be sorted ascending or descending with respect to the corresponding attribute. The order in which the result is sorted is implied by the order of the list of attribute names.

Example: sort=(price ASC, delivery-date DESC)

A list of attribute names that have to be used to create each item in the result set is called a projection. Each specified attribute name is used to extract the corresponding information from each qualifying resource and to compile a corresponding item in the result set.

Example: projection=price,color,status
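
The three artifacts of a query can be sketched against an in-memory collection. To keep the example short, the filter condition is passed as a Python function instead of being parsed from an expression string; the sample products and the function name `query` are assumptions.

```python
# Hypothetical collection of product resources.
products = [
    {"id": 1, "price": 1500, "status": "on-stock", "color": "red"},
    {"id": 2, "price": 150, "status": "sold-out", "color": "blue"},
    {"id": 3, "price": 500, "status": "on-stock", "color": "green"},
]


def query(items, filter_fn, sort=(), projection=None):
    """Apply a filter condition, a sort expression and a projection."""
    result = [item for item in items if filter_fn(item)]
    # Sort by the last attribute first: the sort is stable, so the first
    # attribute in the list ends up as the primary sort criterion.
    for attribute, direction in reversed(list(sort)):
        result.sort(key=lambda item: item[attribute],
                    reverse=(direction == "DESC"))
    if projection is not None:
        result = [{name: item[name] for name in projection} for item in result]
    return result


def complex_filter(p):
    """The complex filter from the example above, written as a predicate."""
    return ((p["price"] > 1000 and p["status"] == "on-stock")
            or (p["price"] < 200 and not p["status"] == "on-stock"))


rows = query(products, complex_filter,
             sort=[("price", "ASC")],
             projection=["price", "color", "status"])
```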

There are two ways to enable queries on collections. First, a query is part of the query string of the URI used by a GET. Second, the query is passed in the body of a POST request of a special processing function resource.

An example of specifying a query as query string of the URI is: GET /products?status=on-stock&sortAsc=price&projection=price,color,status HTTP/1.1

Specifying a query as part of a query string concatenated to the URI of the collection resource that is queried is the preferred way for queries. This is because a GET is used with this URI, clearly expressing the semantics of the request, namely retrieving a subset of the collection. However, in practice URIs have a maximum length depending on the browser or Web server used. When queries may become so complex that the maximum length may be exceeded, a POST request that contains the (complex) query in the request body has to be used. For this purpose, a separate processing function resource is to be realized:

POST /product-search HTTP/1.1

filter=((price > 1000 AND status = on-stock) OR (price < 200 AND NOT(status = on-stock)))
sort=(price ASC, delivery-date DESC)
projection=price,color,status

10.3 Pagination

When a large collection resource (or subsets of it) is to be retrieved, it is often convenient to retrieve the result set in smaller chunks: for example, the latency of the request is reduced, clients can predict the amount of data to be dealt with, etc.

For this purpose, the retrieval request specifies a query string containing an "offset" field as well as a "limit" field; the offset is the position number of a qualified resource where the retrieval should start, and the limit is the maximum number of resources to be returned.

The response message with the subset of the qualified resources returned should specify the total number of all qualified resources ("count" field), a link to the next chunk of qualified resources ("next" field), as well as a link to the previous chunk of qualified resources ("previous" field). The actual format in which these fields as well as the set of resources are returned is application specific. The following example is intended to show the principle only.

Example: GET /products?offset=42&limit=3

Response: HTTP/1.1 200 OK

count=119
next={link-to-next-subset}
previous={link-to-previous-subset}
P42-details
P43-details
P44-details
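
The principle can be sketched as a small function. The sketch assumes a zero-based offset and a dictionary as response format; both, as well as the name `paginate`, are assumptions, since the format is application specific.

```python
def paginate(items, offset, limit, base_uri):
    """Return one chunk of `items` plus the count and navigation links."""
    page = {"count": len(items), "items": items[offset:offset + limit]}
    if offset + limit < len(items):
        page["next"] = f"{base_uri}?offset={offset + limit}&limit={limit}"
    if offset > 0:
        previous_offset = max(offset - limit, 0)
        page["previous"] = f"{base_uri}?offset={previous_offset}&limit={limit}"
    return page


# 119 hypothetical resources, matching count=119 in the example above.
products = [f"P{n}-details" for n in range(119)]
page = paginate(products, offset=42, limit=3, base_uri="/products")
```

The first chunk has no "previous" link and the last chunk has no "next" link, so a client can walk the whole result set by following the links until "next" is absent.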

10.4 Client-Side Caching

A client may cache resources as well as header fields of resources to reduce transfer of data that has not been changed since its last retrieval. For this purpose, a client uses a conditional request to retrieve data. Such a conditional request specifies the If-None-Match or the If-Modified-Since headers in the request.

The value of the If-None-Match header is the value of the ETag header of the resource as retrieved last time by the requesting client. The value of the If-Modified-Since header is the value of the Last-Modified header of the resource as retrieved last time by the requesting client. When performing the conditional request, the API implementation has to compare the value(s) passed by the client in the request with the corresponding values of the current resource as stored by the server (note that If-None-Match takes precedence over If-Modified-Since). If the values have not changed, the resource is not returned (response code "304 Not Modified"); otherwise, the resource is returned.

Example: GET /products/31415
If-None-Match: "42049dcaf450987cffd"
If-Modified-Since: Thu, 7 Jan 2016 17:41 CET

Response: HTTP/1.1 304 Not Modified
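
The server-side evaluation of such a conditional GET might be sketched as follows. The layout of the `resource` dictionary and the function name `conditional_get` are assumptions; the date is written in the GMT-based format produced by `email.utils.formatdate(usegmt=True)`.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get(resource, request_headers):
    """Return (status, body); the body is omitted when the client copy is current."""
    if "If-None-Match" in request_headers:
        # If-None-Match takes precedence over If-Modified-Since.
        not_modified = request_headers["If-None-Match"] == resource["etag"]
    elif "If-Modified-Since" in request_headers:
        since = parsedate_to_datetime(request_headers["If-Modified-Since"])
        # Unchanged if the last modification is not later than the client's timestamp.
        not_modified = resource["last_modified"] <= since
    else:
        not_modified = False
    return (304, None) if not_modified else (200, resource["body"])


# Hypothetical resource state at the server.
resource = {
    "etag": '"42049dcaf450987cffd"',
    "last_modified": datetime(2016, 1, 6, 10, 13, tzinfo=timezone.utc),
    "body": {"price": 1000},
}

cached = conditional_get(resource, {"If-None-Match": '"42049dcaf450987cffd"'})
stale = conditional_get(resource, {"If-None-Match": '"0000"'})
by_date = conditional_get(
    resource, {"If-Modified-Since": "Thu, 07 Jan 2016 16:41:00 GMT"})
```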

10.5 Concurrency Control

Multiple clients interacting with the same resource may cause concurrency conflicts like lost updates. Pessimistic concurrency control mechanisms avoid conflicts in advance by locking resources. This is a good choice if the probability of conflicts is high. Optimistic concurrency control handles conflicts by detecting them and signaling them to clients, which may then retry their requests. This is a good choice if the probability of conflicts is low.

Many concurrent interactions in REST-based APIs can be handled by optimistic concurrency control. For this purpose, requests are sent as conditional requests. A conditional request specifies the If-Match or If-Unmodified-Since header in the request. The value of the If-Match header is the value of the ETag header of the resource as retrieved last time by the requesting client. The value of the If-Unmodified-Since header is the value of the Last-Modified header of the resource as retrieved last time by the requesting client. When performing the conditional request, the API implementation has to compare the value(s) passed by the client in the request with the corresponding values of the current resource as stored by the origin server (note that If-Match takes precedence over If-Unmodified-Since). If the values have changed, the request is rejected (response code "412 Precondition Failed"); otherwise, the request is executed.

Example: PUT /products/31415
If-Match: "42049dcaf450987cffd"
If-Unmodified-Since: Thu, 7 Jan 2016 17:41 CET

Response: HTTP/1.1 412 Precondition Failed
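
How a conditional PUT prevents a lost update can be sketched with two clients that both hold the same entity tag. The digest-based `make_etag`, the store layout and the function name `conditional_put` are assumptions made for this illustration.

```python
import hashlib
import json


def make_etag(body):
    """Derive an entity tag as a digest of the representation (one possible choice)."""
    raw = json.dumps(body, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(raw).hexdigest()[:16] + '"'


def conditional_put(store, uri, body, if_match):
    """Replace the resource only if the client's entity tag is still current."""
    if if_match != store[uri]["etag"]:
        return 412   # the resource changed since the client retrieved it
    store[uri] = {"body": dict(body), "etag": make_etag(body)}
    return 200


original = {"price": 1000, "color": "red"}
store = {"/products/31415": {"body": original, "etag": make_etag(original)}}

# Both clients retrieved the resource earlier and hold the same entity tag.
etag_seen_by_both = store["/products/31415"]["etag"]
status_a = conditional_put(store, "/products/31415",
                           {"price": 900, "color": "red"}, etag_seen_by_both)
status_b = conditional_put(store, "/products/31415",
                           {"price": 1000, "color": "blue"}, etag_seen_by_both)
```

The second client receives 412 instead of silently overwriting the first client's change; it can retrieve the resource again and retry.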

In order to send conditional requests, the requesting client either has to cache these values after having retrieved the resource before, or has to retrieve these values by a GET on the resource that is performed before the conditional request is made. The disadvantage of the latter is that a second request has to be made.

Example: GET /products/31415
...

Response: HTTP/1.1 200 OK
ETag: "4562aae7732a56"
Last-Modified: Wed, 6 Jan 2016 11:13 CET
...

Even with PATCH, special care must be taken with respect to lost updates: if concurrent PATCHes are to be supported, concurrency control has to be implemented as a conditional request.

A similar pattern may be used to realize partial updates without using processing function resources (section 4.5): a client has to GET the resource to be updated, apply the partial updates locally, and then send a conditional PUT with the complete resource to the server.
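
The GET, modify, conditional PUT pattern for partial updates can be sketched as a small retry loop. The class `Server` is a hypothetical in-memory stand-in for the API, and the number of attempts is an arbitrary assumption.

```python
import hashlib
import json


def make_etag(body):
    """Derive an entity tag as a digest of the representation (one possible choice)."""
    raw = json.dumps(body, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(raw).hexdigest()[:16] + '"'


class Server:
    """Hypothetical stand-in for an API holding a single resource."""

    def __init__(self, body):
        self.body = dict(body)
        self.etag = make_etag(body)

    def get(self):
        return dict(self.body), self.etag

    def put(self, body, if_match):
        if if_match != self.etag:
            return 412
        self.body = dict(body)
        self.etag = make_etag(body)
        return 200


def partial_update(server, changes, max_attempts=3):
    """GET the resource, apply the changes locally, PUT the complete resource back."""
    for _ in range(max_attempts):
        body, etag = server.get()
        body.update(changes)
        if server.put(body, if_match=etag) == 200:
            return True
        # 412: someone else changed the resource in the meantime; try again.
    return False


server = Server({"price": 1000, "color": "red"})
updated = partial_update(server, {"price": 900})
```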

10.6 Long Running Requests

Requests may take too long to return the response synchronously, i.e. such requests have to be processed asynchronously. For example, a request like applying for a credit (see next example) may kick off a workflow involving human beings that will take some time. In this case, the API will accept the request and return both the URI of a so-called task resource and, in the response body, the task resource itself. The task resource is a document, like in the following example, that contains the current status of the long running request. The URI of the task resource is passed in the Content-Location header field of the response. This URI can be used by the client to poll for the processing state of the long running task later on.

Example: POST /apply-for-credit HTTP/1.1
...

Response: HTTP/1.1 202 Accepted
Content-Type: application/xml
Content-Location: http://www.shark-credits.com/apply/tasks/1 

<status>
  <state>running</state>
  <link rel="self" href=".../tasks/1"/>
  <estimatedCompletion>2020-04-01</estimatedCompletion>
</status>

The response message specifies, by the "202 Accepted" status code, that the request is accepted but will be processed asynchronously. The task resource says that the request is running and gives an estimated completion time, amongst other appropriate information (note that there is no standardized format for a task resource). In subsequent requests, the client retrieves the current task resource via the URI value of the Content-Location header field:

GET /apply/tasks/1 HTTP/1.1
Host: www.shark-credits.com
...

The sample response is successful ("200 OK"), which means that the retrieval of the task resource was successful. Note especially that this does not indicate that the long running request completed successfully:

HTTP/1.1 200 OK
Content-Type: application/xml
Content-Location: http://www.shark-credits.com/apply/tasks/1

<status>
  <state>running</state>
  <link rel="self" href=".../tasks/1"/>
  <estimatedCompletion>2020-10-10</estimatedCompletion>
</status>

After some time, the long running request will have completed, i.e. the GET request above will receive the following response:

HTTP/1.1 303 See Other
Content-Type: application/xml
Location: http://www.shark-credits.com/apply-for-credit
Content-Location: http://www.shark-credits.com/apply/tasks/1

<status>
  <state>ready</state>
  <link rel="self" href=".../tasks/1"/>
  <message>Image processed &amp; stored</message>
</status>

The status code "303 See Other" indicates that the result of the long running request is available as a new resource whose URI is the value of the Location header field. The task resource now reports in its state field that the long running request completed successfully.

The long running request may fail. In this case, the retrieval of the task resource will still succeed with a "200 OK" status code, and the task resource's state will report that the long running request completed but was not successful. Thus, keep in mind that the status code of the response to the retrieval of the task resource is not related to the outcome of the long running request at all.

HTTP/1.1 200 OK
Content-Type: application/xml
Content-Location: http://www.shark-credits.com/apply/tasks/1

<status>
  <state>FAILED</state>
  <link rel="self" href=".../tasks/1"/>
  <estimatedCompletion>2020-10-10</estimatedCompletion>
</status>
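
The polling behavior of a client can be sketched against a simulated server. The class `TaskServer`, its constructor parameters and the dictionary-based task resource are assumptions made for this illustration; a real client would also wait between two polls.

```python
class TaskServer:
    """Hypothetical stand-in for an API that processes one long running request."""

    def __init__(self, polls_until_done, outcome="ready"):
        self._remaining = polls_until_done
        self._outcome = outcome

    def post_apply_for_credit(self):
        # The request is accepted; the task resource is returned in the body.
        return 202, {"Content-Location": "/apply/tasks/1"}, {"state": "running"}

    def get(self, uri):
        if self._remaining > 0:
            self._remaining -= 1
            return 200, {"Content-Location": uri}, {"state": "running"}
        if self._outcome == "FAILED":
            # Retrieving the task resource succeeds although the request failed.
            return 200, {"Content-Location": uri}, {"state": "FAILED"}
        headers = {"Location": "/apply-for-credit", "Content-Location": uri}
        return 303, headers, {"state": "ready"}


def poll(server, task_uri, max_polls=10):
    """Poll the task resource; return (final state, URI of the result or None)."""
    for _ in range(max_polls):
        status, headers, task = server.get(task_uri)
        if status == 303:
            return "ready", headers["Location"]
        if task["state"] == "FAILED":
            return "failed", None
        # A real client would sleep here before polling again.
    return "running", None


server = TaskServer(polls_until_done=2)
status, headers, task = server.post_apply_for_credit()
result = poll(server, headers["Content-Location"])
```

The client decides on the outcome from the state of the task resource and the "303 See Other" redirect, not from the "200 OK" of the polling request itself.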
