Multi-site website authentication, like mint.com

Time: 2021-12-05 03:24:48

How would one go about creating a site that logs you into other sites and gathers your data? For instance, mint.com allows you to input all your online bank details, and it gathers your data for viewing within Mint.

If someone could point me in the right direction with some keywords or any scripts, it would be much appreciated.

2 Answers

#1


This really depends on what you want to do. For example, Mint.com leverages, or did at one point, an SDK from a company called Yodlee. This SDK/library uses screen-scraping technology to acquire the data on behalf of Mint.com's customers.
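
To make the idea concrete, here's a rough sketch of what "screen scraping on a customer's behalf" boils down to: log in with the credentials the customer has stored with you, then parse the account data out of the resulting HTML. Everything in it is hypothetical (the bank URL, form name, and selectors are invented), and a real aggregation SDK like Yodlee's layers site-specific handling, error cases, and security requirements on top of this basic pattern. It's written against current mechanize, so there's no WWW:: prefix:

#!/usr/bin/ruby
# Hypothetical sketch only: log into a bank site with a customer's stored
# credentials and pull account balances. The URL, form name, and CSS
# selectors below are invented for illustration.
require 'rubygems'
require 'mechanize'

def fetch_balances(username, password)
  agent = Mechanize.new { |a| a.user_agent_alias = 'Mac Safari' }

  # 1) Authenticate: load the login page and submit the login form.
  login_page = agent.get('https://bank.example.com/login')
  form = login_page.form_with(name: 'loginForm')   # assumed form name
  form['username'] = username
  form['password'] = password
  dashboard = agent.submit(form)

  # 2) Scrape: pull each account row out of the summary table.
  dashboard.search('table#accounts tr.account').map do |row|   # assumed markup
    { name: row.at('td.name').text.strip, balance: row.at('td.balance').text.strip }
  end
end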

#2


In general, you need to automate site access and parsing, aka scraping. There are usually two tricky areas to watch out for: 1) authentication, and 2) whatever you're scraping will typically require you to inspect its HTML closely while you figure out what you're trying to accomplish.
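
Before you can automate the authentication step, you usually have to work out which form and which fields the site actually expects, and that's where the HTML inspection comes in. As a rough starting point (the URL below is just a placeholder), something like this will dump every form and field on a page so you can see what a scripted login needs to fill in:

#!/usr/bin/ruby
# Print the forms and input fields on a page so you can see what an
# automated login will need to fill in. The URL is a placeholder --
# point it at the site you're actually working against.
require 'rubygems'
require 'mechanize'

agent = Mechanize.new { |a| a.user_agent_alias = 'Mac Safari' }
page = agent.get('https://www.example.com/login')

page.forms.each do |form|
  puts "form #{form.name.inspect} (action: #{form.action})"
  form.fields.each     { |f| puts "  field:    #{f.name} = #{f.value.inspect}" }
  form.checkboxes.each { |c| puts "  checkbox: #{c.name}" }
  form.buttons.each    { |b| puts "  button:   #{b.name}" }
end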

A while back I wrote a simple Ruby app that scrapes and searches Apple's refurbished store, which you can check out here as an example (keep in mind it could certainly use improvement, but it may get you going):

http://grapple.xorcyst.com

I've written similar stuff using mechanize and hpricot to grab data from my bank accounts (I'm not too keen on giving Mint my credentials), as well as from job sites, used car dealerships, etc., so it's a flexible approach if you want to put in the effort.

It's a useful thing to do, but you need to be careful not to violate any use policies and the like.

Here's another quick example that grabs job postings, just to show you how simple it can be:

#!/usr/bin/ruby

require 'rubygems'
require 'mechanize'
require 'hpricot'
require 'open-uri'

url = "http://tbe.taleo.net/NA2/ats/careers/jobSearch.jsp?org=DIGITALGLOBE&cws=1"
site = WWW::Mechanize.new { |agent| agent.user_agent_alias = 'Mac Safari' }
page = site.get(url)

# Fill in the search form's fields (names found by inspecting the page's HTML)
search_form = page.form("TBE_theForm")
search_form.org = "DIGITALGLOBE"
search_form.cws = "1"
search_form.act = "search"
search_form.WebPage = "JSRCH"
search_form.WebVersion = "0"
search_form.add_field!('location','1') #5
search_form.add_field!('updatedWithin','2')

# Submit the search and hand the resulting HTML to Hpricot for parsing
search_results = site.submit(search_form)
doc = Hpricot(search_results.body)

puts "<b>DigitalGlobe (Longmont)</b>"

# Print only the anchors whose markup contains rid= (the links to individual postings)
doc.search("//a").each do |a|
  if a.to_s.rindex('rid=') != nil
    puts a.to_s.gsub('"','')
  end
end
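
One caveat on that example: it was written against old versions of these gems. Hpricot is no longer maintained, and newer mechanize releases dropped the WWW:: namespace and parse pages with Nokogiri, so you can call search on the result page directly. With current gems the same scrape would look roughly like this (same form fields, just newer APIs; treat it as a sketch, since the Taleo page itself may have changed):

#!/usr/bin/ruby
# Rough modern-gems version of the job-postings scrape above: no Hpricot,
# and Mechanize.new instead of WWW::Mechanize.new.
require 'mechanize'

url = "http://tbe.taleo.net/NA2/ats/careers/jobSearch.jsp?org=DIGITALGLOBE&cws=1"
agent = Mechanize.new { |a| a.user_agent_alias = 'Mac Safari' }
page = agent.get(url)

search_form = page.form_with(name: "TBE_theForm")
search_form.org = "DIGITALGLOBE"
search_form.cws = "1"
search_form.act = "search"
search_form.WebPage = "JSRCH"
search_form.WebVersion = "0"
search_form.add_field!('location', '1')
search_form.add_field!('updatedWithin', '2')

results = agent.submit(search_form)

puts "DigitalGlobe (Longmont)"

# Keep only the links that point at individual postings (their hrefs carry
# a rid= parameter), mirroring the rindex('rid=') check in the original.
results.links.select { |link| link.href.to_s.include?('rid=') }.each do |link|
  puts "#{link.text.strip} - #{link.href}"
end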
