For all these examples you will need your API access token, which is normally emailed to you when you start your trial or subscription. The token is a hexadecimal string (e.g. 0607107d6165e994e0c3c5b470c93cb801f6180a). You can obtain a free trial token here.
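As a minimal sketch, the token can be checked and turned into the Authorization header used by the requests in this guide. The helper name build_auth_headers is hypothetical. It only verifies that the string is hexadecimal; the token length is not checked because it is not specified here.

```python
import string


def build_auth_headers(token):
    """Return request headers for the given API token (hypothetical helper)."""
    token = token.strip()
    # The token is described as a hexadecimal string; reject anything else early.
    if not token or any(ch not in string.hexdigits for ch in token):
        raise ValueError('API token should be a non-empty hexadecimal string')
    return {
        'Content-Type': 'application/json',
        'Authorization': 'Token {}'.format(token),
    }


headers = build_auth_headers('0607107d6165e994e0c3c5b470c93cb801f6180a')
print(headers['Authorization'])
```

Failing early on a malformed token avoids a round trip that would only end in an authorization error.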

Web

  • Go to WebCookies.org API Swagger
  • Click the Authorize button in the top right corner
  • Enter Token 0607107d6165e994e0c3c5b470c93cb801f6180a in the api_key input field and click Authorize
  • You are now authorized and all API methods with their documentation are displayed
  • For a quick start, look at the /api2/urls/ method and try it with a URL of your choice

If you get errors, your token may not be valid yet or may no longer be valid. Please contact us for support.

Curl

curl -X POST --data '{ "url": "https://httpbin.org/cookies/set?hello=world" }' \
-H Content-Type:application/json \
-H "Authorization: Token 0607107d6165e994e0c3c5b470c93cb801f6180a" \
https://webcookies.org/api2/urls/

If the URL is already in the database, a no-op response will be returned:

{'message': 'URL already scanned', 'url': 'https://httpbin.org/cookies/set?hello2=world', 'url_id': 3128245}

To refresh the results for this URL (re-run the scan), add a rescan=true parameter to the URL:

curl -X POST --data '{ "url": "https://httpbin.org/cookies/set?hello=world" }' -H Content-Type:application/json -H "Authorization: Token 0607107d6165e994e0c3c5b470c93cb801f6180a" 'https://webcookies.org/api2/urls/?rescan=true'

The URL will be queued for processing and the following response returned:

 {'message': 'URL successfully queued for processing', 'task_id': '3c8fd06f-c54a-40a0-9597-31b4418a5de6', 'url': 'https://httpbin.org/cookies/set?hello2=world', 'url_id': 3128245}

The task_id value can then be used to query the scan status before fetching the results:

curl -X GET --header 'Authorization: Token 0607902d6065e994e0c3c5b570c93cb801f6280a' 'https://webcookies.org/api2/task-status/3c8fd06f-c54a-40a0-9597-31b4418a5de6'

While the URL is being scanned, the endpoint will return the PENDING status:

{'task_id': '3c8fd06f-c54a-40a0-9597-31b4418a5de6', 'status': 'PENDING', 'metadata': None}

When the task has completed successfully, the endpoint will return the SUCCESS status:

{'task_id': '3c8fd06f-c54a-40a0-9597-31b4418a5de6', 'status': 'SUCCESS', 'metadata': 'True'}

The previously returned url_id can now be used to retrieve the scan results for this URL:

 curl -X GET --header 'Authorization: Token 0607902d6065e994e0c3c5b570c93cb801f6280a' 'https://webcookies.org/api2/urls/3128245/'

The returned JSON structure contains identifiers of objects, such as cookies, that can be retrieved using further API calls as documented on the WebCookies.org API Swagger pages:

 {"id":3128245,"date_fetched":"2017-03-09T16:17:27.117932Z","status":{"code":200,"details":null},"httpcookie_set":[20735303],"flashcookie_set":[],"localstoragecookie_set":[],"sessionstoragecookie_set":[],"canvastracker_set":[],"httpheader_set":[3653390,3653389],"adultrating_set":[],"clientaccesspolicy_set":[],"sslyzescan":null,"crossdomain_set":[],"url":"https://httpbin.org/cookies/set?hello2=world"}
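The date_fetched field can help decide whether a stored scan is recent enough or whether a rescan is worth requesting. The following is only a sketch: the helper names and the 30-day threshold are assumptions chosen for illustration, not part of the API.

```python
from datetime import datetime, timedelta, timezone


def parse_date_fetched(value):
    """Parse a date_fetched timestamp such as '2017-03-09T16:17:27.117932Z'."""
    # Try with and without fractional seconds; the trailing Z denotes UTC.
    for fmt in ('%Y-%m-%dT%H:%M:%S.%fZ', '%Y-%m-%dT%H:%M:%SZ'):
        try:
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError('Unrecognised date_fetched value: {!r}'.format(value))


def is_stale(result, max_age=timedelta(days=30), now=None):
    """Return True if the stored scan is older than max_age (assumed policy)."""
    now = now or datetime.now(timezone.utc)
    return now - parse_date_fetched(result['date_fetched']) > max_age


result = {'id': 3128245, 'date_fetched': '2017-03-09T16:17:27.117932Z'}
# Checked one day after the scan, the result is still considered fresh.
print(is_stale(result, now=datetime(2017, 3, 10, tzinfo=timezone.utc)))
```

If is_stale returns True, the same URL can be resubmitted with rescan=true as described above.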

The scan usually stays in progress (the PENDING state) for about a minute. Please contact us if you feel this takes too long, if you get FAILED responses for pages that you believe are working normally, or if your token is not recognized, as in this response:

{"detail":"Invalid token."}
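A client can detect this case before doing anything else with the response. The sketch below uses a hypothetical helper, is_invalid_token, and relies only on the detail message shown above, since the accompanying HTTP status code is not documented here.

```python
def is_invalid_token(body):
    """Return True if a decoded JSON body is the invalid-token error shown above."""
    # Only the 'detail' message is documented here, so the check relies on it
    # rather than on a particular HTTP status code.
    return isinstance(body, dict) and body.get('detail') == 'Invalid token.'


print(is_invalid_token({'detail': 'Invalid token.'}))
print(is_invalid_token({'message': 'URL already scanned', 'url_id': 3128245}))
```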

Python

Using the classic requests HTTP client library:

#!/usr/bin/python3

import requests
import time

TOKEN = 'Token 0607902d6065e994e0c3c5b570c93cb801f6280a'

headers = {'Content-Type': 'application/json', 'Authorization': TOKEN}
data = {'url': 'https://httpbin.org/cookies/set?hello2=world'}

# try to add the URL to WebCookies database for scanning
r = requests.post('https://webcookies.org/api2/urls/', headers=headers, json=data)
print('Scan', r)
print(r.json())

If the URL is already in the database, the API will return the 409 Conflict HTTP code and a JSON response with detailed information:

Scan <Response [409]>
{'message': 'URL already scanned', 'url': 'https://httpbin.org/cookies/set?hello2=world', 'url_id': 3128245}

The returned url_id can be used straight away to fetch the previously collected results (/api2/urls/{}/), or you can force a rescan of the URL using the /api2/urls/?rescan=true query parameter:

# possible responses:
# code 201 Created - the URL was added to the database
# code 409 Conflict - the URL is already in the database; in that case we just force a rescan
if r.status_code == 409:
    r = requests.post('https://webcookies.org/api2/urls/?rescan=true', headers=headers, json=data)
    assert r.status_code == 201
    print('Rescan', r)
    print(r.json())

The API returns the 201 Created status code if the URL was new or a rescan was forced. The JSON response also contains a task_id that can be used to query the scan task status:

Rescan <Response [201]>
{'message': 'URL successfully queued for processing', 'task_id': '3c8fd06f-c54a-40a0-9597-31b4418a5de6', 'url': 'https://httpbin.org/cookies/set?hello2=world', 'url_id': 3128245}

The task status will usually be one of SUCCESS, PENDING or FAILED:

# task_id is a short-lived identifier for the current scan task
task_id = r.json().get('task_id')
# url_id is a long-lived identifier for the URL
url_id = r.json().get('url_id')

# wait for completion
while True:
    r = requests.get('https://webcookies.org/api2/task-status/{}'.format(task_id), headers=headers)
    print(r.json())
    status = r.json().get('status')
    print('Status', status)
    # stop once the task reaches a final state
    if status in ('SUCCESS', 'FAILED'):
        break
    # still PENDING (or any other non-final state): wait before polling again
    time.sleep(10.0)

Example responses:

{'task_id': '3c8fd06f-c54a-40a0-9597-31b4418a5de6', 'status': 'PENDING', 'metadata': None}
Status PENDING
{'task_id': '3c8fd06f-c54a-40a0-9597-31b4418a5de6', 'status': 'PENDING', 'metadata': None}
Status PENDING
{'task_id': '3c8fd06f-c54a-40a0-9597-31b4418a5de6', 'status': 'SUCCESS', 'metadata': 'True'}
Status SUCCESS
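For unattended use it can help to bound the polling time. The sketch below wraps the same loop in a hypothetical wait_for_task function. The 10 second interval mirrors the loop above, while the 300 second timeout is an arbitrary assumption. The status lookup is passed in as a callable so the logic can be tried without network access.

```python
import time


def wait_for_task(fetch_status, interval=10.0, timeout=300.0, sleep=time.sleep):
    """Poll fetch_status() until the task is SUCCESS or FAILED, or time runs out.

    fetch_status is any callable returning the decoded task-status JSON.
    """
    deadline = time.monotonic() + timeout
    while True:
        body = fetch_status()
        status = body.get('status')
        if status in ('SUCCESS', 'FAILED'):
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError('Task still {} after {}s'.format(status, timeout))
        sleep(interval)


# Demonstration with canned responses instead of real HTTP calls.
responses = iter([{'status': 'PENDING'}, {'status': 'SUCCESS'}])
print(wait_for_task(lambda: next(responses), sleep=lambda seconds: None))
```

With the requests library, fetch_status could be a lambda that performs the GET on /api2/task-status/ with the task_id and headers defined earlier and returns r.json().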

After a successfully completed scan, the url_id can be used to fetch the results:

# fetch results
r = requests.get('https://webcookies.org/api2/urls/{}/'.format(url_id), headers=headers)
print(r.json())

The response contains references to various cookie-like objects which can be fetched using other API methods:

{'canvastracker_set': [], 'flashcookie_set': [], 'status': {'details': None, 'code': 200}, 'httpcookie_set': [20735303], 'adultrating_set': [], 'sslyzescan': None, 'sessionstoragecookie_set': [], 'localstoragecookie_set': [], 'date_fetched': '2017-03-09T16:17:27.117932Z', 'id': 3128245, 'crossdomain_set': [], 'httpheader_set': [3653390, 3653389], 'url': 'https://httpbin.org/cookies/set?hello2=world', 'clientaccesspolicy_set': []}
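To see which kinds of objects a scan actually found, the non-empty reference lists can be picked out of the result. This is a sketch with a hypothetical helper, collect_object_ids. It assumes that every field whose name ends in _set holds a list of identifiers, as in the example response above.

```python
def collect_object_ids(result):
    """Map each non-empty *_set field of a scan result to its list of IDs."""
    return {
        key: ids
        for key, ids in result.items()
        if key.endswith('_set') and ids
    }


result = {
    'id': 3128245,
    'httpcookie_set': [20735303],
    'flashcookie_set': [],
    'httpheader_set': [3653390, 3653389],
    'sslyzescan': None,
}
print(collect_object_ids(result))
# → {'httpcookie_set': [20735303], 'httpheader_set': [3653390, 3653389]}
```

The identifiers can then be passed to the corresponding API methods listed on the Swagger pages.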