Notes: Programming Google App Engine

## Introduction
* 3 parts: application instances, data storage, scalable services.
* From the application's perspective, App Engine provides: 1) storage that persists between requests, 2) distribution of traffic across servers, 3) scalable resources (CPU, memory, servers).
* From App Engine's perspective: 1) it creates and destroys application instances as needed, 2) it runs each application in a sandbox.
* Supports Go/Java/Python
* Datastore: use transactions
* Memcache/storage system
* Send/receive messages (mail/XMPP)
* Search

## Configuring an Application
* Python 2.7: declare that your code is thread-safe with `threadsafe: true`. This is the recommended way (*compare*: coroutine, not thread-safe)
* Authorization:
```
handlers:
- url: /account/.*
  script: account.py
  login: required
```
This is App Engine configuration-based authorization (*compare*: App Engine API-based authorization (more fine-grained?))
* Service: under `/_ah/`
* Admin console (*compare*: application shell in bridge)
* Environment: Python, virtual env. Includes:
1) Standard library
2) Libraries/tools in App Engine SDK (like API for accessing services).
3) Other third-party libraries the application uses
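
The `login: required` rule in the handler configuration above can be read as a routing table: handlers are tried in order, the first URL pattern that matches the whole path wins, and a signed-out user is sent to the sign-in page. The following is only a toy model of that behavior, not App Engine code; the catch-all `main.py` entry and the `route` function are hypothetical names added for illustration.

```python
import re

# Mirrors the handlers list from the configuration; the second entry
# (a catch-all handled by main.py) is a hypothetical addition.
HANDLERS = [
    {'url': r'/account/.*', 'script': 'account.py', 'login': 'required'},
    {'url': r'/.*', 'script': 'main.py'},
]


def route(path, signed_in):
    """Return the script that would serve `path`, or a login redirect marker."""
    for handler in HANDLERS:
        # The pattern must match the entire path; the first match wins.
        if re.match('(?:%s)$' % handler['url'], path):
            if handler.get('login') == 'required' and not signed_in:
                return 'redirect-to-login'
            return handler['script']
    return None


print(route('/account/settings', signed_in=False))
print(route('/account/settings', signed_in=True))
print(route('/about', signed_in=False))
```

API-based authorization would instead make this decision inside the request handler itself, which is why it can be more fine-grained.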

## Request Handlers and Instances
* App Engine handles load balancing and instance scaling for the application, so you can mostly ignore instances and focus on request handlers.
* Runtime environment: sandbox:
1) App cannot spawn additional processes
2) App cannot make arbitrary network connections
3) App can only read its own part of the filesystem
4) App cannot see other applications/processes running on the server
* GAE Sandbox implementation: replacing standard library calls (and other methods?)
* Limitations: requests (response time, request size, etc.), services (datastore, memcache), deployment (resource files)
* Each request handler has a pool of instances. App Engine starts and shuts down instances as needed, and routes requests to instances based on availability. (GAE instance scaling vs. DAE worker scaling. *How?* "If all instances are busy, App Engine starts a new instance.")
* Instances support multithreading.
* Instance scaling: a new instance is started when all instances are busy.
* Instance busy (multithreading disabled): the instance is currently handling a request.
* Instance busy (multithreading enabled): decided by:
1) Current load (cpu/memory) from active request handlers
2) Historical load by previous requests
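
The scaling rule for the single-threaded case ("start a new instance when all instances are busy") can be sketched as a tiny scheduler. This is a toy model under that one assumption, not how App Engine is implemented; `Instance` and `ToyScheduler` are hypothetical names, and the load-based definition of "busy" for multithreaded instances is not modeled.

```python
class Instance(object):
    """A toy application instance that handles one request at a time."""

    def __init__(self, instance_id):
        self.instance_id = instance_id
        self.busy = False


class ToyScheduler(object):
    """Routes each request to an idle instance, starting one if none is idle."""

    def __init__(self):
        self.instances = []

    def acquire(self):
        for instance in self.instances:
            if not instance.busy:
                instance.busy = True
                return instance
        # Every existing instance is busy: start a new one.
        instance = Instance(len(self.instances))
        instance.busy = True
        self.instances.append(instance)
        return instance

    def release(self, instance):
        instance.busy = False


scheduler = ToyScheduler()
first = scheduler.acquire()    # no instances yet, so one is started
second = scheduler.acquire()   # the first is busy, so a second is started
scheduler.release(first)
third = scheduler.acquire()    # the first is idle again and gets reused
```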

## Datastore Entities
* 2011-12: master/slave datastore -> high replication datastore (no scheduled maintenance)
```
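from google.appengine.ext import db

class Book(db.Model):
    title = db.StringProperty()
    author = db.StringProperty()
    copyright_year = db.IntegerProperty()

# Create and save an entity
obj = Book(title="")  # set the other properties as needed
obj.put()

# Query
q = db.Query(Book)
# The property used in an inequality filter must be the first sort order
q.filter('copyright_year >', 2015).order('copyright_year').order('-title')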
```

## Large Data and the Blobstore
* Files of unlimited size, but each API call to the Blobstore is limited to 32 megabytes.

## Fetching URLs and Web Resources
* GAE provides a URL Fetch service built on Google infrastructure.
* The standard urllib/urllib2/httplib modules are overridden to use it.

## Task Queues and Scheduled Tasks
**Reason**
* [DAEMON] Updating an element of data may require several related but time-consuming updates.
* [MQ] It’s often acceptable to record what work needs to get done, respond to the user right away, then do the work later.
* [CRON] Scheduled updating/analysis

**MQ**
* Producer: enqueues task; Consumer: a process, separate from the producer, leases tasks on the queue.
* Operation: push/pull queues
* Enabled when you deploy your application
* Configuration file:
```
total_storage_limit: 200M  # top-level setting: data stored across all queues
queue:
- name: default
  rate: 10/s
  mode: push  # for a pull queue, use `mode: pull` and omit `rate`
```

*Enqueue*
```
from google.appengine.api import taskqueue
taskqueue.add(queue_name='name')
```

* Feature: countdown: number of seconds to wait before the task starts.
* Feature: ETA: expected start time, but tasks may be delayed.
* Feature: delete/retry/purge/find.
* Feature: task chain: a task can enqueue another task while it runs.
* Task handlers are mapped to URLs like `/_ah/queue/name` (user requests to these paths are blocked), and report their result through the HTTP status code.
* Handlers run in separate threads, and use the same scaling mechanism as user requests.
* No separate MQ logs; task requests are merged into the application's request logs.
* Task queues may target different application versions and their corresponding handlers.
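
The way a push queue treats a handler's HTTP status (a 2xx status means the task succeeded, anything else means it should be retried) can be sketched with a small in-memory simulation. This is an illustration only and does not use the `taskqueue` API; `run_push_queue`, `flaky_handler`, and the `max_attempts` cap are hypothetical, and real retries are also spaced out over time.

```python
import collections


def run_push_queue(tasks, handler, max_attempts=3):
    """Deliver each task to handler; retry until it returns a 2xx status."""
    queue = collections.deque((task, 1) for task in tasks)
    completed, abandoned = [], []
    while queue:
        task, attempt = queue.popleft()
        status = handler(task)
        if 200 <= status < 300:
            completed.append(task)
        elif attempt < max_attempts:
            queue.append((task, attempt + 1))  # re-enqueue for a retry
        else:
            abandoned.append(task)
    return completed, abandoned


calls = collections.Counter()


def flaky_handler(task):
    """Hypothetical handler: 'resize' fails once, 'broken' always fails."""
    calls[task] += 1
    if task == 'broken':
        return 500
    if task == 'resize' and calls[task] == 1:
        return 503
    return 200


completed, abandoned = run_push_queue(['email', 'resize', 'broken'], flaky_handler)
```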

**Scheduled Tasks (aka cron jobs)**
* App Engine calls the URL regularly with an empty GET request.
* Has a deadline; no retry.
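
Because a scheduled task is just a GET request to one of the application's URLs, the handler is an ordinary request handler. A minimal sketch as a plain WSGI callable follows; it relies on the `X-Appengine-Cron: true` header that App Engine adds to cron requests, while `cron_app` and the `call` helper are hypothetical names and the scheduled work itself is left out.

```python
def cron_app(environ, start_response):
    """WSGI app that runs a scheduled job only for requests from cron."""
    # App Engine sets "X-Appengine-Cron: true" on cron requests and strips
    # that header from requests coming from outside.
    is_cron = environ.get('HTTP_X_APPENGINE_CRON') == 'true'
    if environ.get('REQUEST_METHOD') != 'GET' or not is_cron:
        start_response('403 Forbidden', [('Content-Type', 'text/plain')])
        return [b'forbidden']
    # ... the scheduled work would go here ...
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'done']


def call(app, environ):
    """Small helper for trying the app locally: returns (status, body)."""
    captured = {}

    def start_response(status, headers):
        captured['status'] = status

    body = b''.join(app(environ, start_response))
    return captured['status'], body


print(call(cron_app, {'REQUEST_METHOD': 'GET', 'HTTP_X_APPENGINE_CRON': 'true'}))
print(call(cron_app, {'REQUEST_METHOD': 'GET'}))
```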

## Optimizing Service Calls
* Async calls for datastore/mc/URL fetch
```
from google.appengine.api import urlfetch

# Start both fetches without blocking
rpc1 = urlfetch.make_fetch_call(urlfetch.create_rpc(), url1)
rpc2 = urlfetch.make_fetch_call(urlfetch.create_rpc(), url2)

# sync point, wait for the longest rpc call
combine(rpc1.get_result(), rpc2.get_result())

# other sync points
# in progress -> ready
rpc.wait()

# ready -> checked
rpc.check_success()
```
* Async db/mc
```
rpc1 = db.get_async(k)

rpc2 = mc_client.get_multi_async(k)
```
* Callbacks
```
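# Attach the callback before starting the call
self.rpc = urlfetch.create_rpc()
self.rpc.callback = self.process_results
urlfetch.make_fetch_call(self.rpc, url)

def process_results(self):
    # The callback takes no RPC argument, so read it from self
    results = self.rpc.get_result()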
```
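
The same start-then-sync pattern can be tried outside App Engine. The sketch below is an analogy built on the standard `concurrent.futures` module (Python 3), not the App Engine RPC API: `slow_call` is a hypothetical stand-in for a service call, `result()` plays the role of `get_result()`, and `add_done_callback` plays the role of the RPC callback.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_call(name, seconds):
    """Stand-in for a service call that takes `seconds` to complete."""
    time.sleep(seconds)
    return name


seen = []  # filled in by the callbacks as calls complete

with ThreadPoolExecutor(max_workers=2) as pool:
    # Start both calls; neither submit() blocks.
    future1 = pool.submit(slow_call, 'a', 0.2)
    future2 = pool.submit(slow_call, 'b', 0.3)
    for future in (future1, future2):
        future.add_done_callback(lambda f: seen.append(f.result()))
    # Sync point: result() blocks until that call has finished, so the
    # total wait is roughly the longest call rather than the sum of both.
    results = [future1.result(), future2.result()]
```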

## Deploying and Managing Applications
* Deploying / reverting / inspecting performance / analytic graphs of traffic and resource usage (like DAE bridge)
