API Reference

This section provides documentation for the main APCloudy classes and methods.

APCloudyClient

class apcloudy.APCloudyClient(api_key: str = '', settings=None)

Bases: object

Represents a client for interacting with the APCloudy API.

This class provides methods to make authenticated HTTP requests, manage projects (e.g., retrieval, creation, and listing), and validate the connection with the APCloudy API. The client supports additional features such as retry logic for transient errors and rate-limiting compliance.

Variables:
  • api_key – The API key used for authentication with the APCloudy API.

  • base_url – The base URL for the APCloudy API.

  • session – The session object used to handle HTTP requests.

__init__(api_key: str = '', settings=None)

Initialize the APCloudy client.

Parameters:
  • api_key – Your APCloudy API key

  • settings – Scrapy settings object (optional)

http_request(method: str, endpoint: str, **kwargs) → Dict[str, Any]

Make API request with retry logic

Parameters:
  • method – HTTP method

  • endpoint – API endpoint

  • **kwargs – Additional arguments for requests

Returns:

API response data

Return type:

Dict
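The client is described as retrying transient errors, but this reference does not spell out the policy. As a rough illustration only, the sketch below shows one common approach: exponential backoff around a callable. The names call_with_retry, TransientError, max_attempts, and base_delay are hypothetical and not part of APCloudy; the library's actual retry behaviour may differ.

```python
import time
from typing import Any, Callable, Dict


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. a timeout or HTTP 503)."""


def call_with_retry(
    send: Callable[[], Dict[str, Any]],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call ``send`` until it succeeds or ``max_attempts`` is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between attempts.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the helper easy to exercise without real delays; in normal use the default time.sleep applies.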

get_project(project_id: int) → ProjectManager

Get a project manager for the specified project

Parameters:

project_id – Project ID

Returns:

Project manager instance

Return type:

ProjectManager

list_projects() → List[Project]

List all projects

Returns:

Available projects

Return type:

List[Project]

create_project(name: str, description: str = '') → Project

Create a new project

Parameters:
  • name – Project name

  • description – Project description

Returns:

Created project

Return type:

Project
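As a usage sketch for the methods documented above, the helper below summarises the projects visible to a client. It relies only on list_projects() and the Project fields project_id, name, and spider_count described in this reference; summarize_projects itself is a hypothetical helper, not part of APCloudy. It accepts any object that exposes list_projects(), so it can be tried without network access.

```python
from typing import Any, List


def summarize_projects(client: Any) -> List[str]:
    """Return one 'id: name (N spiders)' line per project, sorted by name."""
    projects = client.list_projects()
    return [
        f"{p.project_id}: {p.name} ({p.spider_count} spiders)"
        for p in sorted(projects, key=lambda p: p.name)
    ]
```

With the real package installed, this would presumably be called as summarize_projects(APCloudyClient(api_key='YOUR_KEY')).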

Main Methods

Project Operations

  • list_projects() - List all projects

  • get_project(project_id) - Get a specific project

Spider Operations

  • get_spiders(project_id) - List spiders in a project

  • get_spider(project_id, spider_name) - Get a specific spider

Job Operations

  • get_jobs(project_id) - List jobs for a project

  • get_job(job_id) - Get job details

  • start_job(project_id, spider_name) - Start a new job

  • stop_job(job_id) - Stop a running job

Models

Job State

class apcloudy.models.JobState(value)

Represents the state of a job in a task or workflow management system.

This enumeration defines the states a job moves through during its lifecycle: scheduled, running, completed, and deleted. Use it to track, control, and monitor jobs.

SCHEDULED = 'scheduled'
RUNNING = 'running'
COMPLETED = 'completed'
DELETED = 'deleted'
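Because each member's value is a plain string, a state string taken from an API payload can be converted with the standard Enum lookup by value. The sketch below redeclares the enum locally so that it runs without the package installed; in real code JobState would be imported from apcloudy.models. The helper is_finished, and its notion of which states count as terminal, are assumptions made for illustration.

```python
from enum import Enum


class JobState(Enum):
    """Local mirror of the documented members, for illustration only."""
    SCHEDULED = "scheduled"
    RUNNING = "running"
    COMPLETED = "completed"
    DELETED = "deleted"


def is_finished(state: JobState) -> bool:
    """Assumption: completed and deleted jobs will not run again."""
    return state in (JobState.COMPLETED, JobState.DELETED)


state = JobState("running")  # lookup by value
print(state.name, is_finished(state))
```

An unknown string such as JobState("paused") raises ValueError, which is worth handling if the API may introduce new states.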

Job

class apcloudy.models.Job(job_id: str, spider_name: str, state: JobState, project_id: str = '', created_at: datetime | None = None, started_at: datetime | None = None, finished_at: datetime | None = None, items_scraped: int = 0, requests_made: int = 0, job_args: Dict[str, Any] = <factory>, units: int = 1, logs_url: str | None = None, items_url: str | None = None)

Represents a job execution and maintains information related to job lifecycle, metrics, and associated resources.

This class is used for tracking the progress, state, and details of a specific job. It can manage metadata such as creation time, start time, finish time, and other attributes that describe the job’s execution process.

Variables:
  • job_id – Unique identifier for the job assigned by the system.

  • spider_name – Name of the spider used to execute the job.

  • state – Current state of the job, represented as a JobState instance.

  • project_id – Identifier of the project the job belongs to.

  • created_at – Timestamp when the job was created, or None if not available.

  • started_at – Timestamp when the job was started, or None if not available.

  • finished_at – Timestamp when the job was finished, or None if not available.

  • items_scraped – Total number of items successfully scraped by the job.

  • requests_made – Total number of requests made during the job execution.

  • job_args – Dictionary of additional arguments or configuration parameters passed to the job.

  • units – Number of resource units used (e.g., processing capacity) by the job.

  • logs_url – URL containing logs associated with the job, or None if not set.

  • items_url – URL containing scraped items for the job, or None if not set.

job_id: str
spider_name: str
state: JobState
project_id: str = ''
created_at: datetime | None = None
started_at: datetime | None = None
finished_at: datetime | None = None
items_scraped: int = 0
requests_made: int = 0
job_args: Dict[str, Any]
units: int = 1
logs_url: str | None = None
items_url: str | None = None

classmethod from_dict(data: List[Dict[str, Any]]) → List[Job]

Creates Job instances from a list of dictionaries and displays the job details in a tabulated format.

This method deserializes each dictionary into a Job instance and sets its attributes accordingly. It also formats and prints job details such as job ID, spider name, state, and timestamps in an organized layout.

Parameters:

data (List[Dict[str, Any]]) – List of dictionaries, each containing the data for one job.

Returns:

A list of Job instances populated from the given data.

Return type:

List[Job]

property duration: float | None

Get job duration in seconds
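The reference does not say how duration is derived. One plausible reading, assumed here, is the difference between finished_at and started_at, with None returned when either timestamp is missing. JobTimes is a hypothetical stand-in holding only those two fields; it is not the library's Job class.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class JobTimes:
    """Hypothetical stand-in with only the two timestamps duration needs."""
    started_at: Optional[datetime] = None
    finished_at: Optional[datetime] = None

    @property
    def duration(self) -> Optional[float]:
        # Assumption: a job that has not both started and finished has no duration.
        if self.started_at is None or self.finished_at is None:
            return None
        return (self.finished_at - self.started_at).total_seconds()
```

Under this reading, a job that is still running reports None rather than the time elapsed so far.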

__init__(job_id: str, spider_name: str, state: JobState, project_id: str = '', created_at: datetime | None = None, started_at: datetime | None = None, finished_at: datetime | None = None, items_scraped: int = 0, requests_made: int = 0, job_args: Dict[str, Any] = <factory>, units: int = 1, logs_url: str | None = None, items_url: str | None = None) → None

Spider

class apcloudy.models.Spider(name: str, description: str = '', project_id: str = '', settings: Dict[str, Any] = <factory>)

Represents a spider

name: str
description: str = ''
project_id: str = ''
settings: Dict[str, Any]

classmethod from_dict(data: List[Dict[str, Any]]) → List[Spider]

Create Spider instances from an API response

__init__(name: str, description: str = '', project_id: str = '', settings: Dict[str, Any] = <factory>) → None

Project

class apcloudy.models.Project(project_id: str, org_name: str, name: str, description: str = '', created_at: datetime | None = None, spider_count: int = 0, job_count: int = 0)

Represents a project

project_id: str
org_name: str
name: str
description: str = ''
created_at: datetime | None = None
spider_count: int = 0
job_count: int = 0

classmethod from_dict(data: Dict[str, Any]) → Project

Create Project instance from API response

__init__(project_id: str, org_name: str, name: str, description: str = '', created_at: datetime | None = None, spider_count: int = 0, job_count: int = 0) → None
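from_dict maps an API payload onto the fields listed above. The payload layout is not documented here, so the sketch below assumes keys named after the fields and an ISO 8601 created_at string. ProjectSketch is a hypothetical stand-in rather than the library's class, and the real from_dict may handle keys and defaults differently.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class ProjectSketch:
    """Hypothetical mirror of the documented Project fields."""
    project_id: str
    org_name: str
    name: str
    description: str = ""
    created_at: Optional[datetime] = None
    spider_count: int = 0
    job_count: int = 0

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "ProjectSketch":
        # Assumption: payload keys match the field names one to one.
        raw_created = data.get("created_at")
        return cls(
            project_id=str(data["project_id"]),
            org_name=data["org_name"],
            name=data["name"],
            description=data.get("description", ""),
            created_at=datetime.fromisoformat(raw_created) if raw_created else None,
            spider_count=int(data.get("spider_count", 0)),
            job_count=int(data.get("job_count", 0)),
        )
```

Required fields are read with data[...] so a malformed payload fails loudly with KeyError, while optional fields fall back to the documented defaults.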

Exceptions

APCloudy-specific exceptions

exception apcloudy.exceptions.APCloudyException

Base exception for APCloudy operations

exception apcloudy.exceptions.APIError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when API returns an error response

__init__(message: str, status_code: int | None = None, response_data: dict | None = None)

exception apcloudy.exceptions.AuthenticationError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when authentication fails

exception apcloudy.exceptions.JobNotFoundError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when a job is not found

exception apcloudy.exceptions.ProjectNotFoundError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when a project is not found

exception apcloudy.exceptions.SpiderNotFoundError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when a spider is not found

exception apcloudy.exceptions.RateLimitError(message: str, status_code: int | None = None, response_data: dict | None = None)

Raised when rate limit is exceeded
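Since every specific exception shares the APIError constructor, it seems likely, though it is not confirmed here, that they subclass APIError. The sketch below assumes that hierarchy using local stand-in classes, and also assumes the constructor arguments are stored as attributes of the same names. It shows the usual pattern of catching the most specific error first; the describe helper is hypothetical.

```python
from typing import Optional


class APCloudyException(Exception):
    """Stand-in for the documented base exception."""


class APIError(APCloudyException):
    def __init__(self, message: str, status_code: Optional[int] = None,
                 response_data: Optional[dict] = None):
        super().__init__(message)
        # Assumption: arguments are kept as attributes of the same names.
        self.message = message
        self.status_code = status_code
        self.response_data = response_data or {}


class RateLimitError(APIError):
    """Assumed to subclass APIError."""


class JobNotFoundError(APIError):
    """Assumed to subclass APIError."""


def describe(exc: APCloudyException) -> str:
    """Turn an exception into a short message, most specific type first."""
    try:
        raise exc
    except RateLimitError as e:
        return f"rate limited (status {e.status_code}); retry later"
    except JobNotFoundError as e:
        return f"job not found: {e.message}"
    except APIError as e:
        return f"API error {e.status_code}: {e.message}"
    except APCloudyException as e:
        return f"APCloudy error: {e}"
```

If the hierarchy holds, a single except APCloudyException clause is enough to catch anything the client raises, with narrower clauses placed before it where a case needs special handling.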