XML for Analysis(XMLA)开发详解-(4)XMLA over TCP/IP访问Analysis Services 2005/2008 Olap的释疑

时间:2022-07-03 17:24:56

目录

1、综述(简要回顾一下Olap开发接口)

2、XMLA概念(基本来自联机手册,介绍一些概念,如果熟悉概念可以略过)

3、XMLA配置(针对各个版本的SSAS、IIS的配置说明)

4、XMLA over TCP/IP(说明一下为什么SSAS内置的XMLA over TCP/IP在开发时行不通)

5、 XML for Analysis(XMLA)开发详解-(5)XMLA请求的格式及实例示意

6、元数据访问接口Discover方法(获取目标OLAP服务的结构性信息及属性信息)

7、数据查询接口Execute方法(执行MDX语句查询)及全文总结

XML/A over TCP/IP作为SQl Server Analysis Service 2005/2008自身客户端(SQL Server Management Studio/Business Intelligence Development Studio)和服务器端通讯的基本协议,却无法直接在我们第三方开发中使用。    因为XML/A over TCP/IP是微软的保留曲目,是专利保护的技术,目前只供自己使用。本文作者对该协议进行了详细分析,感兴趣的可以看看。但实际研发中,相当长一段时间我们如果想采用XML/A都必须通过XML/A over HTTP,配合IIS扩展使用。   对技术细节感兴趣同仁,不防通过Google专利搜索工具(http://www.google.com/patents)获取关于XML/A over TCP/IP的专利全文学习.

 

关于XMLA Over TCP/IP的讨论地原文如下: 

 

It has been widely publicized that Analysis Services 2005 uses XML for Analysis 1.1 (XMLA) as its only network protocol for communication between client and server. XMLA is standard public protocol defined by XMLA council and full description can be found at www.xmla.org. However, while reading it, it becomes immediately clear, that the standard only discuss client server communication over HTTP only. Indeed, here is a small excerpt from the standard - "The specification is built upon the open Internet standards of HTTP, XML, and SOAP". XMLA uses SOAP and SOAP has standard binding to HTTP through the use of HTTP headers. However, while AS2005 can be configured to work over HTTP, the typical way of connecting to AS2005 is directly through TCP/IP. There is no standard binding of SOAP to TCP/IP in general, and in particular XMLA standard says nothing about it. So what we are going to do is to reveal piece by piece and layer by layer what happens on the wire when Analysis Services client (either OLEDB, ADOMD.NET or AMO) works with Analysis Services server through TCP/IP. By doing so, we will discover and discuss several important performance and security tradeoffs which are useful for tuning performance and for secure maintaining the server as well as for troubleshooting. I also have to disclose, that the TCP/IP protocol to AS is patented technology. It is not secret - because patents are public information - but it cannot be freely copied or reimplemented by 3rd party.

I will structure this article in a form of detective story. Armed with network monitor, we will intercept the communications and will analyze it. For the purpose of experiment, we will be studying the response MDX query sent to the Foodmart 2000 Sales cube migrated to Analysis Services 2005. I am using "MDX Sample" application which shipped with AS2000 instead of SQL Management Studio which shipped with AS2005. MDX Sample uses ADOMD which uses OLEDB as opposed to SQL Management Studio which uses ADOMD.NET. This is actually not the reason why I need to use MDX Sample, since both OLEDB and ADOMD.NET use exactly same wire protocol. The reason is that we will need to change some connection string properties while doing some experiments, and unfortunately it is not possible with SQL Management Studio.

So let's start by sending the following simple MDX query and watching what will come on the wire

SELECT Measures.MEMBERS on COLUMNS FROM Sales

What we get back doesn't look like XMLA message at all (here and below the non-printable binary characters are substituted with dot character)

....application/sx+xpress...e......d...Uy.8k..P...a:|.>.Oo..+.....M/.h6..a+..........e...p:..&..%B..R..9.J7'......F-..z...,..#....y..6X.gx..*.....d.....D....k.m..C...T........_......>z#.L....4....j...Z....~.c..tl..3...%..H.....RQ......f%.fn....oR./.%..g.X..o..u.`.}.{....~..~X.7Is...-...P.5.2... .+Vm'.3.LU.W....&.......{.}.....6h..W..C.PD..2lM-....v<.t....._..f0g..n.m.?.". .S33.....F.............2(.......6{......'.......`.....2.gCc..I.&..../..l).<....>v.m.O..d..*.&)...-....M......q6...s./..A....P4..*..K..4.=....c...z...,.Q.b]9U........y'.i.."N}...noi..K..r.lptl."3......cw...Y..R5.Q.H...1.........#...h..A....28}o.XC..Gu...(...N......s..~....{...5|Q0...W..U.i8su..!.J.Qs..K%....*.;9.i3....H......[7...tc......;.......]..U..t.v...;v.lB...$...(z..D.......&..1...P.5R.of..b.:2..(.....Z1..zfY.n.6+../s}...ND../.0.UD..p.|.!....../fQ.........0.S..N..v->..}..w..<.c"2t.......p.t{..1.}....88.A............&..i8x.z.vDQ.....<.Q'.D[;HJ.....,...1`d/...i.x...G...E.P.O...?.^..8/.!.!.?..Nr2.ONa.|..JE...B....Q..f...R.1.d8/.)....H_..Rn#....._..@.k$>.Ad...4'b..>....w.B..t..j.#...../..4mFY.Z/.* -......f.>.7R.~.j.by9.../......un.._......t...[.P........."..S.3....J....'N.ne!T..@.......G...9.....F....)xA3.v... N..c.....W}.f... G./Rf..6.m.8.qB$....`...K..I....FV.6*..68..z...~..........!8..Ta.t..'.L.f.$W....]....6oF..-....S..`6........'b/`W.o<......cQfR#..r..........kLY............d......e.....I....j=3.ppf..0..!pn..u.....(.+...........=.....qo.tgu....,....e#..[.5-..aY7.........N.!....M.........B.b`aJ9M|.!6....k..6..i.......6U......x....F...x.-....9.uo....{.g..p.n.m.....'I..n.[V..;.g<..7.....g.S|..T(YW....4X6bL..]...}...........{.`.1.E.5...(....a....#.Y....OEM"...e..|.e..*.>.*@nn...w=0..C@...wD...#..5v..8O.[.....uuB..'. i.....j..?.."....W../..-m...%..x.c........0.5.u...i..Q..9............vB......P.q=......Y.j..L.j.j...$.C..NU........4P..7b.....&..;.u.h..WE4T..1.$.Q.<*..m...s1F.........ri18.G.k....F....j..ey0.]...j....V.CUZ.J.O.VE.....~o...;.>....E#......4. .E.l..,...7..E..........X.:[...8...~5.X.U....p.E.Of..L...q.x..j~.b.`.....KB..+......B.y^l...>...I<.b&.E.....ED..3..%b.......k.DM"..FB..#I|....u...A5........y..!....o..:..=g.....(4....7.......#kG...s.$"...~.?..6....9%'FrE.......p..c...F...~....i..|U....x...].(..=C.|;u.EL.(.{.;...<..U/ .3...OG....v.;57...:.......&....._g......_pW.......D9.!..#.....'j..J%.c.Z|........qo...n6[g..Km"...h......G...1.lcsDo#.....o..E.'.Em..E.........F....F..r...A........AA.R.U!......O....x._..W..k..........e...... t.&E.KBt..U.K....A..E$...2i&...4.J./].o......UM...i..E.......h.{..u...JKb....s-....1.G$n...p.|.2_9.;.........%..cB..]..d.p.....].'..6/K^..].n.]o}.gQ.D..T.W.>g.xYy...3...X...y.{...6;.j3...-i......:...D....E!..5.....3.......q.AU}..=.....E..Z.T7.[.P..q..Y.......M.[...V..e.... ....N.S.A..TN.......#4.G.A....?......9.9.#..{qS..IB...`B.....@...D.Ps.4.;..M.....$......?4[.X...~..!w@.....!. =DU.h.n

First few bytes where it says "application/sx+xpress" look meaningful, but everything else appears to be complete garbage. We will come back to the explanation of what the "application/sx+xpress" string means later. For now let's decipher the rest of the message. One hint into what is happening here is if you will try to send exactly same query to exactly same cube - you will see completely different content. This is because what we actually see on the wire - is encrypted content. That's right, by default AS2005 encrypts (and also digitally signs) all the network messages. You may have heard about Trustworthy Computing initiative which swept the entire Microsoft - well this is just one of the hundreds design changes that were made to the product. For encryption and digital signing of network messages AS uses encryption algorithm from the SSPI provider used for network authentication. By default in AS2005 the SSPI package is "Negotiate", which usually resolves itself to either Kerberos or NTLM, depending on how network is set up. In both of these cases encryption is done by symmetric encryption algorithm using the session key which is negotiated during authentication. The performance impact on both server and client CPUs for doing encryption/decryption is small - in my experiments it was below 5%, so it is usually a good choice to continue to have everything encrypted. In some scenarios, however, when response sizes are large and server is very loaded and the network is trusted, it may be beneficial to disable encryption or digital signing or both to improve performance. The level of encryption is controlled by the standard OLEDB property DBPROP_INIT_PROTECTION_LEVEL. Below is the excerpt from OLEDB documentation:

Indicates the level of protection of data sent between client and server. This property applies only to network connections other than RPC connections; these protection levels are similar to those provided by RPC. One of the following:
  • DB_PROT_LEVEL_NONE — Performs no authentication of data sent to the server.
  • DB_PROT_LEVEL_CONNECT — Authenticates only when the client establishes the connection with the server.
  • DB_PROT_LEVEL_CALL — Authenticates the source of the data at the beginning of each request from the client to the server.
  • DB_PROT_LEVEL_PKT — Authenticates that all data received is from the client.
  • DB_PROT_LEVEL_PKT_INTEGRITY — Authenticates that all data received is from the client and that it has not been changed in transit.
  • DB_PROT_LEVEL_PKT_PRIVACY — Authenticates that all data received is from the client, that it has not been changed in transit, and protects the privacy of the data by encrypting it.

In OLEDB connection string as well as in ADOMD.NET this property is mapped to string "Protection Level". Analysis Services supports the following values:

  • "none" - however by default server will not accept non-authenticated clients, so specifying this value will fail the connection
  • "connect" - only authentication is done, no encryption or digital signatures
  • "pkt integrity" - messages are digitally signed but not encrypted
  • "pkt privacy" - messages are both digitally signed and encrypted

The "call" and "pkt" values are not supported. As we mentioned before, "Protection Level=pkt_privacy" is the default, but now we will disable it by explicitly specifying "Protection Level=connect" in the connection string. Now after executing the very same MDX query, we will get different result.

....application/sx+xpress...w:..]..............1...0...U.T.F.-.1.6...)h.T..At..p.:./..s.c.h.e.m.aX...xH.lH.ox.px.o...*.r.g:.../X.n.v(.l..p8./.......E..................n8.:..k...........BX.d.y.........&u..n.........-8.i..rp...f@.-x.o..:..-H.n..lp.sh.s...E..e....8.ux.e.R..s..o8.sX..................../.5.././r..t...........0s....-:..d..ah.).q.....oH........V.n.....I.k.Q....sH..........w......3(.].2@.0../.X.M.....L.S(.=.-X...tX.n..eo.dh...... o..'.. ..'......l-@.g...........!.r....NX.mH...aX.........w.I.a.....X...t.#..m.Dx.f(.u...........qx...i..i..dH....J.k...VE.'........ ..?'.....p..e..T@.pH. ........my...#.........MX.m.b(.r^....s0.q.u(.............&+b.........e./......#.#8..._...m..n.Oh.cH.r.............0..a.........u..b .9.dH.d...p..oH...sW@.-.CX.n....tx........s.k..p........t..i`.S*............H......c...y.#t(...`.....8.s0.....n.g...........P..q0........DX....`......I.A.v......ux.........,}{.[q@.i.....,...(..+.Q..Q....T.........'."......e...,..........}s........G..Z.....p......c..._.s..U.......o.E.7./....gx...0D..Z... .....S..t.......o..cp.0.!.......s.........z........j....[..C.......dX.c...(...L..&G....U..i..n8....Z[.(.....f.."........4............S@.z..h......s..g8.I.I8.t..3.x....O0.1O..f.o.....C..y..........W.F...E...7.)..i..L(./.....D8.t..U...2..........T..........1?.....j..A.1Ts........]..._................3.G._..0k..+...!....B.P....C0.l.......o......:.......]...!.]...a.?.i.w...........D........w.........9"....O..d....x...#.............+o8.... .P'......c8.K.X....O'v.............H._.......D..a.(.t..H............ o#.....S)..$......x.%..?......&..b.. ..Sx..7..O*...'.!.r9w.DG`.8.!..).E`.x.o../..n..l...W..`s..r......5.3H.YPg............/ .W-..@.(."G._.a...@.. 7........).#......*.$......0...//.x.+..%..T...+..[h.e..s..a.s.]...U?.,..&P..W......E..B(.R...."_...I.Q8.Eh.N.A..EH........dp....a.......C..p-f....+G..-.'........C..P.(I.Ox.O....L......((..y./.L...w.ZV..L.......Y.uX.H./.)H..O..N............%...DX..*.......U.0.*?..DX.SX.L..YX.Ih.F..:..=..*.A"....S..C.r8.{.k..[.3....tP.r..8..A.a.......1.+..?....[...i...M..............j.........&../.............../....&...[..;...+..m...........m...............c.{.o.&.....................O+.[...4d@.c....+.......'...+............7.......?.....,..o...i.8!.......O....+...m..u9 ..Q.i./`.+.......&W........Q........I."_......%../...o...I. O.........O..+.W.s..+........!......-......Y..{....m..............o.......88....o........+.[...)..A.4rX...+.............................(P........e..........H7.......o.....+.2E...4yw.. ..e.t..`..2....k......27................2_../....2......o....8.O...N............O.+.[.G....r......+...."+.......).m./...............#../..O..y.hf....N....j...?.._.+...A....mi.... ..I.t...#.......[...'..{...]........HDO.._...$........O.....4?..._...?........+.[..e..r .y0...K.eA.+........X....K.Y...._......(?......#....h7.O../.....}/...........o.+.[...Fr...... 8.A5Sp.F.T..+..O.....)[......g..;.).../...Y.X'.......(O..........(.....O........+hw...M...+........3.........I..].O.....H.......H'../..........t.............&Q.....f0.`.2.,..V.'l....e...3.-P...AP.U..`..F.(tp.?.4..8....R..Ap.Th.D.....8_%........m......5./...C..L.._..R........w...........6.0....{.1.x.7.1.....@..TX.p.+...8..2.......9.3..M....TeP.b(.r..:.4.&..T.._.Up..Cy................._....L(....sv..ifp.....0.;...3...G..CH...A................&s......Q.'...........U..............o:(m..s.... ..e....'..'..w..3..Y.....2l..H.;.A...*..}]..s.......... /.. +.(<.)x.....8........YH..Pi.&x.1.9......7x....u.......o.]....d...ch...+. ..s.......N..`...........;.....wm..?...dh.].;.........y..........................I............x=/....m.... ....................'[....;. _...y..g......'.....?....O.Z.. ................?.......f...+...O......W..y...........c}$l.ov... ?.....G.........../..-....?....h..k......Y............../...&..TH.pH................y..&............0.Ga.DH...HW;.5....x.<.6...5.7..0@>-.0..TH..jQ.A......2.6..,(97.3..0....6q.1r.=.i....u.2...*.5p.68.7p.2..u.2r..[.B.?!r..$`.6..,B.8..18...%@.C3...5S........8p.8...=$.4!......k......3..9.....T6h.0..9(.........@..                          

Well, the result is different, but it still doesn't look like XMLA message - it still looks mostly like garbage. Only in the first row we see something that remotely resembles the string "1.0 UTF-16 http://schemas.xmlsoap.org", however the further we move through that string, the more distorted it becomes. For example, instead of "http:" it looks like "hTAtp:" and "xmlsoap" part is almost not recognizable. What happens now is that the XMLA message gets compressed before it is sent over the network. The patented compression algorithm that Analysis Services uses is an adaptive one - this is why the beginning of the message still can be read in clear, but quickly the algorithm picks up and the stream becomes very well compressed. Since XMLA responses can be quite big and may contain a lot of strings - compression of the messages should improve performance of the network transmission. This is especially true for the WANs and other low bandwidth networks. However, compression (and decompression) are not for free - they do take CPU time. Analysis Services supports 10 levels of compression which offer tradeoff between size and speed. The compression level can be controlled by means of connection string property "Compression Level". Integer values from 0 to 9 are valid compression levels. The default compression level is 9, and it can be changed at the server by changing the server config property Network/Responses/CompressionLevel. Note, that the encryption was done on top of compression and not vice versa. There are good reasons for that. One reason is to do with performance - because encryption tends to add entropy to the data, and data with higher entropy is not compressible as well as data with low entropy. Another reason is to do with security - it is harder for the attacker to try to break encryption when more parts of the encrypted message are not known. With XMLA there is a big percent of the response which can be predicted - all the XMLA standard tags - , , etc. When the data is compressed such attacks are harder to mount. Sometimes it may be useful to turn compression off completely - for example for troubleshooting - this can be achieved by connection string property "Transport Compression". The two valid values are (note that they are case sensitive)

  • "None" - for disabling compression
  • "Compressed" - for enabling compression
  • "Default" - for default behavior (same as "Compressed" in this version)

It is important to note here, that everything that we said about compression, applies by default only to server responses, but doesn't apply to client requests. By default no matter what properties are set - the requests are always uncompressed. The reasoning for such behavior is because of the observation that requests are typically small (just the content of MDX query) and responses are large. But why not to compress the requests anyway - even if they are small - what's the harm in it - you may ask. Here we need to be reminded about Trustworthy Computing again. The server is designed to be robust and withstand hackers attacks to break it. One of the common techniques to achieve this goal is called  Reducing Surface Area of Attack. By default server simply doesn't even accept requests which are compressed, therefore if somebody will find security hole in the decompression algorithm - he won't be able to exploit it to attack the server. Of course we hope that no such hole would be ever found, but by disabling decompression code on the server we create additional layer of defense. The assumption that requests are small is usually a correct one, however, there are few scenarios when requests can become quite big. The most important scenarios probably are

  • Pipeline processing - when Integration Services pipeline pushes data to Analysis Services either for cube processing or mining model training
  • Massive cell writeback - when client application updates a lot of cells and sends them to the server
  • Frequent creation of complex UDMs - when XML DDL which defines UDM becomes very big and is sent to the server frequently
  • Certain data mining scenarios - again when there is a lot of data to be pushed to the server

If you ever run into this or similar situation, you can always change the default server behavior by changing the following server config property "Network/Requests/EnableCompression" from the default value 0 to value 1. Now to continue our investigation of AS protocol let's connect with connection string "Protection Level=connect;Transport Compression=None" and again execute our MDX query. This time the response is notably bigger since no compression was applied to it. Therefore we will only paste here the first few bytes

....application/sx.........1...0...U.T.F.-.1.6...)h.t.t.p.:././.s.c.h.e.m.a.s...x.m.l.s.o.a.p...o.r.g./.s.o.a.p./.e.n.v.e.l.o.p.e./...s.o.a.p...E.n.v.e.l.o.p.e..........x.m.l.n.s.:.s.o.a.p........)h.t.t.p.:././.s.c.h.e.m.a.s...x.m.l.s.o.a.p...o.r.g./.s.o.a.p./.e.n.v.e.l.o.p.e./...B.o.d.y.........&u.r.n.:.s.c.h.e.m.a.s.-.m.i.c.r.o.s.o.f.t.-.c.o.m.:.x.m.l.-.a.n.a.l.y.s.i.s...E.x.e.c.u.t.e.R.e.s.p.o.n.s.e.........x.m.l.n.s........&u.r.n.:.s.c.h.e.m.a.s.-.m.i.c.r.o.s.o.f.t.-.c.o.m.:.x.m.l.-.a.n.a.l.y.s.i.s...r.e.t.u.r.n.........0u.r.n.:.s.c.h.e.m.a.s.-.m.i.c.r.o.s.o.f.t.-.c.o.m.:.x.m.l.-.a.n.a.l.y.s.i.s.:.m.d.d.a.t.a.s.e.t...r.o.o.t...........0u.r.n.:.s.c.h.e.m.a.s.-.m.i.c.r.o.s.o.f.t.-.c.o.m.:.x.m.l.-.a.n.a.l.y.s.i.s.:.m.d.d.a.t.a.s.e.t...x.m.l.n.s.:.x.s.i........)h.t.t.p.:././.w.w.w...w.3...o.r.g./.2.0.0.1./.X.M.L.S.c.h.e.m.a.-.i.n.s.t.a.n.c.e...x.m.l.n.s.:.x.s.d........ h.t.t.p.:././.w.w.w...w.3...o.r.g./.2.0.0.1./.X.M.L.S.c.h.e.m.a... h.t.t.p.:././.w.w.w...w.3...o.r.g./.2.0.0.1./.X.M.L.S.c.h.e.m.a...x.s...s.c.h.e.m.a..........t.a.r.g.e.t.N.a.m.e.s.p.a.c.e........0u.r.n.:.s.c.h.e.m.a.s.-.m.i.c.r.o.s.o.f.t.-.c.o.m.:.x.m.l.-.a.n.a.l.y.s.i.s.:.m.d.d.a.t.a.s.e.t...e.l.e.m.e.n.t.F.o.r.m.D.e.f.a.u.l.t.........q.u.a.l.i.f.i.e.d....0u.r.n.:.s.c.h.e.m.a.s.-.m.i.c.r.o.s.o.f.t.-.c.o.m.:.x.m.l.-.a.n.a.l.y.s.i.s.:.m.d.d.a.t.a.s.e.t...x.m.l.n.s.:.x.s........ h.t.t.p.:././.w.w.w...w.3...o.r.g./.2.0.0.1./.X.M.L.S.c.h.e.m.a....c.o.m.p.l.e.x.T.y.p.e.........n.a.m.e..........M.e.m.b.e.r.T.y.p.e....s.e.q.u.e.n.c.e.........a.n.y.........n.a.m.e.s.p.a.c.e.........#.#.t.a.r.g.e.t.N.a.m.e.s.p.a.c.e...m.i.n.O.c.c.u.r.s.........0...m.a.x.O.c.c.u.r.s.........u.n.b.o.u.n.d.e.d...p.r.o.c.e.s.s.C.o.n.t.e.n.t.s.........s.k.i.p......a.t.t.r.i.b.u.t.e...........H.i.e.r.a.r.c.h.y...t.y.p.e.........x.s.:.s.t.r.i.n.g..........P.r.o.p.T.y.p.e......e.l.e.m.e.n.t...........D.e.f.a.u.l.t.....0..........n.a.m.e.....x.s.:.s.t.r.i.n.g...u.s.e.........r.e.q.u.i.r.e.d.........t.y.p.e.....x.s.:.Q.N.a.m.e..........T.u.p.l.e.T.y.p.e..........M.e.m.b.e.r......M.e.m.b.e.r.T.y.p.e.....u.n.b.o.u.n.d.e.d...........M.e.m.b.e.r.s.T.y.p.e..........M.e.m.b.e.r......M.e.m.b.e.r.T.y.p.e.....0.....u.n.b.o.u.n.d.e.d..........H.i.e.r.a.r.c.h.y.....x.s.:.s.t.r.i.n.g.....r.e.q.u.i.r.e.d...........T.u.p.l.e.s.T.y.p.e..........T.u.p.l.e.....T.u.p.l.e.T.y.p.e.....0.....u.n.b.o.u.n.d.e.d.......g.r.o.u.p.... ......S.e.t.T.y.p.e....c.h.o.i.c.e....!........M.e.m.b.e.r.s.....M.e.m.b.e.r.s.T.y.p.e.........T.u.p.l.e.s......T.u.p.l.e.s.T.y.p.e.........C.r.o.s.s.P.r.o.d.u.c.t.....S.e.t.L.i.s.t.T.y.p.e.........U.n.i.o.n........r.e.f...."....S.e.t.T.y.p.e.....0.....u.n.b.o.u.n.d.e.d.............S.e.t.L.i.s.t.T.y.p.e........S.e.t.T.y.p.e.....0.....u.n.b.o.u.n.d.e.d.........S.i.z.e.....x.s.:.u.n.s.i.g.n.e.d.I.n.t..........O.l.a.p.I.n.f.o..........C.u.b.e.I.n.f.o............C.u.b.e.....u.n.b.o.u.n.d.e.d............C.u.b.e.N.a.m.e.....x.s.:.s.t.r.i.n.g.........L.a.s.t.D.a.t.a.U.p.d.a.t.e.....0.....x.s.:.d.a.t.e.T.i.m.e.........L.a.s.t.S.c.h.e.m.a.U.p.d.a.t.e.....0.....x.s.:.d.a.t.e.T.i.m.e...............A.x.e.s.I.n.f.o............A.x.i.s.I.n.f.o.....u.n.b.o.u.n.d.e.d............H.i.e.r.a.r.c.h.y.I.n.f.o.....0.....u.n.b.o.u.n.d.e.d............#
Now this looks much more human readable then before. The reason for the dot after every character is because all the strings are in Unicode, and therefore their second byte is NULL. But otherwise what we see here looks very much like SOAP Envelope with the content looking like XSD schema for the MDDataSet result - just like in the XMLA 1.1 spec. However, there is still one strange thing about this result. We would expect XML, which means familiar < and > brackets around tags - and they are nowhere to be seen. To make things even more interesting, let's take a look at the end of the message, where we should have CellData section containing actual cell values.
.......C.e.l.l.D.a.t.a.....;.5..C.e.l.l.....<.6...5.7..0...-.....TH.A......2.6.6.,.7.7.3...0.0....6.7..1...-.=.i....A......2.2.5.,.6.2.7...2.3....6.7..2...-..[.B.?!A.....$.5.6.5.,.2.3.8...1.3....6.7..3...-.5S...........8.6.8.3.7....6.7..4...-.....k..A......3.3.9.,.6.1.0...9.0..........                                    

This is even more interesting - note only < and > are missing, but all the XML tags such as , , as well as attributes such as - everything is missing. There is single CellData tag and single Cell tag, but there are no closing tags. And the result contains some binary characters as well. All this is pretty strange, after all we removed all the layers which could add binary content - such as compression and encryption - so we should get clear XML now. Well, we kind of got XML here (or, more precisely, we got XML Infoset). There are many different encoding for XML, two most popular being UTF-8 and UTF-16. However, there are more encodings, and there is no reason why someone wouldn't come up with new encoding. So this is exactly what AS does. By default it uses proprietary encoding called SX. It is exactly same encoding that SQL Server uses when it communicates with its OLEDB driver when it sends results for the SQL SELECT statements which use "FOR XML" clause. I have heard people calling it sometimes "binary XML", but I really don't like this term. There is a lot of confusion around this term with many different people calling many different things "binary XML" - now W3 standard body has a special working group to study "binary XML". Analysis Services simply uses encoding called SX to encode XML. The way SX works is maintains a dictionary for things like XML tags, namespaces attribute names etc - so the actual string would appear only once, and after that it is encoded in the dictionary. Closing tags are never emitted, because of the symmetry property in XML. These things mostly help with the size of the message, but what perhaps is even more important is that SX encoding is XSD schema aware which means that elements are stored in their native data type. To explain this with example - suppose we have element with name Value and type double. In UTF-8 encoding it would look something like

3.3961089640000253E5

As you can see the actual element value has been converted to (rather lenghty) string. However with SX encoding this example would look like just 8 bytes holding the actual binary representation of the value. This goes beyond just saving space - it actually saves CPU time and memory copying - since none of the cell values need to be converted to strings for sending out into request - they are just sent in their native representation. The data type awareness is, in my opinion, the most important characteristics of SX encoding. Of course if the data type of the element is string - then it will be a string. SX encoding only works with Unicode strings, which is natural fit for AS, since it also keeps all of its strings in Unicode. It is possible to control which encoding AS will use by means of connection string property "Protocol Format". It can be one of the following values

  • "XML" - UTF-8 encoding is used
  • "Binary" - SX encoding is used (the value name is unfortunately confusing)
  • "Default" - the default behavior (same as "Binary" in this version)

The same comment about responses vs. requests which was done in the compression section also applies here. By default server doesn't accept requests in SX encoding - but pretty much in the same cases where you would want to enable compression on requests you would want to enable SX encoding on requests as well. This is done by changing server config property with misleading name Network/Requests/EnableBinaryXML from default value 0 to value 1. We can now connect again to the server this time using new connection string "Protection Level=connect;Transport Compression=None;Protocol Format=XML" and send our query. This time the result looks like following (middle section skipped for brevity):

................text/xmlAll Store Size in SQFT[Store Size in SQFT].[(All)]01000[Store Type].[All Store Type]All Store Type[Store Type].[(All)]01000<CellData>2.66773E5266,773.002.2562723359999983E5225,627.235.652381299999928E5$565,238.1386837868373.3961089640000253E5339,610.90          

This now looks very familiar - it is a classic XMLA response in MDDataSet format and UTF-8 XML encoding. One interesting thing to discuss between SX and UTF-8 encodings is that in SX, as we noted above, all the strings are in Unicode, therefore taking twice as much space as those in UTF-8. So it may seem that UTF-8 could be a better encoding, especially if there are a lot of strings in the resultset, since it will be smaller. In practice. however, since AS keeps all of its strings in Unicode, when UTF-8 is used - it spends CPU time to convert from Unicode before sending, and back to Unicode upon receiving. And since by default the message is being compressed - all these extra NULL bytes get eliminated anyway.

So now that we uncovered all of the layers, let's discuss the actual binding of XMLA message to the TCP/IP protocol. We know that for HTTP this is done through HTTP headers, and there is no standard binding of SOAP to TCP/IP. Actually, at some point (around year 2000 when we started Yukon), there was an attempt to standardize SOAP bindings to TCP/IP through protocol called DIME. You can find description of DIME here - http://msdn.microsoft.com/msdnmag/issues/02/12/DIME/default.aspx and the formal specification here - http://www.gotdotnet.com/team/xml_wsspecs/dime/draft-nielsen-dime-01.txt. This spec never made it into real standard and expired (in favor of WS-Attachments I suppose), but Analysis Services successfully adopted it - this simple and lightweight protocol really fit all the needs without unnecessary overhead. So now it should be clear what are these few bytes at the beginning of each response - it is the DIME header, and the strings that we saw at the beginning are DIME TYPEs. This is why compressed and SX encoded message had DIME type "application/sx+xpress", but when we removed compression it became "application/sx", and finally in the clear text encoding it was "text/xml".

In the closing I summarize that we saw some of the details of what exactly goes on the wire when client connects to the server over TCP/IP, we learned several connection string properties and server config properties which affect the data format, we discussed security and performance implications of such changes and justifications for the default values. Please remember, that all the discussion was about TCP/IP protocol, and when Analysis Services is set up over HTTP - the picture is different, although there are some similarities in the stack - perhaps separate article is warranted to discuss what happens exactly with HTTP. Also note that this information applies to the versions of OLEDB and ADOMD.NET which shipped together with Analysis Services 2005. You may know that there is version of ADOMD.NET which shipped about year and half ago on the Web. It can connect to the AS2005 server, and it can take advantage of some layers discussed here (such as encryption), but it doesn't have support for other layers (compression and SX encoding) therefore when  AS2005 server detects that it works against that early version of ADOMD.NET, it downgrades protocol to text/xml. Therefore for better performance it is recommended to upgrade ADOMD.NET to the newer version.

Hopefully armed with all the information in this article, the Analysis Services protocol will become demystified for you and you will be able to apply this knowledge in the real-world settings.