from: http://jacksondunstan.com/articles/1617
Stage3D Upload Speed Tester
Since Flash Player 11's new Stage3D allows us to utilize hardware acceleration for 3D graphics, there is a whole new set of performance considerations to keep in mind. Today's article discusses the performance of uploading data from system memory (RAM) to video memory (VRAM), such as when you upload textures, vertex buffers, and index buffers. Is it faster to upload to one type rather than another? Is it faster to upload from a Vector, a ByteArray, or a BitmapData? Is there a significant speedup when using software rendering, where VRAM is the same as RAM? Find out the answers to all of these questions below.
The performance test below checks the upload speeds, in both hardware and software mode, of all of these types:
- Texture from…
  - BitmapData (with and without alpha)
  - ByteArray
- VertexBuffer3D from…
  - Vector
  - ByteArray
- IndexBuffer3D from…
  - Vector
  - ByteArray
Check it out:
package
{
    import flash.display3D.*;
    import flash.display3D.textures.*;
    import flash.display.*;
    import flash.events.*;
    import flash.utils.*;
    import flash.text.*;

    public class Stage3DUploadTester extends Sprite
    {
        private var __stage3D:Stage3D;
        private var __logger:TextField = new TextField();
        private var __context:Context3D;
        private var __driverInfo:String;

        private var __texture:Texture;
        private var __bmdNoAlpha:BitmapData;
        private var __bmdAlpha:BitmapData;
        private var __texBytes:ByteArray;

        private var __vertexBuffer:VertexBuffer3D;
        private var __vbVector:Vector.<Number>;
        private var __vbBytes:ByteArray;

        private var __indexBuffer:IndexBuffer3D;
        private var __ibVector:Vector.<uint>;
        private var __ibBytes:ByteArray;

        public function Stage3DUploadTester()
        {
            __stage3D = stage.stage3Ds[0];

            __logger.autoSize = TextFieldAutoSize.LEFT;
            addChild(__logger);

            // Allocate texture data: 2048x2048 pixels, 4 bytes per pixel
            __bmdNoAlpha = new BitmapData(2048, 2048, false, 0xffffffff);
            __bmdAlpha = new BitmapData(2048, 2048, true, 0xffffffff);
            __texBytes = new ByteArray();
            var size:int = __texBytes.length = 2048*2048*4;
            for (var i:int = 0; i < size; ++i)
            {
                // Indexed writes store a single byte
                __texBytes[i] = 0xff;
            }

            // Allocate vertex buffer data: 65535 vertices, 64 floats each
            size = 65535*64;
            __vbVector = new Vector.<Number>(size);
            for (i = 0; i < size; ++i)
            {
                __vbVector[i] = 1.0;
            }
            __vbBytes = new ByteArray();
            // Stage3D expects little-endian data in uploaded ByteArrays
            __vbBytes.endian = Endian.LITTLE_ENDIAN;
            __vbBytes.length = size*4;
            for (i = 0; i < size; ++i)
            {
                __vbBytes.writeFloat(1.0);
            }
            __vbBytes.position = 0;

            // Allocate index buffer data
            size = 524287;
            __ibVector = new Vector.<uint>(size);
            for (i = 0; i < size; ++i)
            {
                __ibVector[i] = 1;
            }
            __ibBytes = new ByteArray();
            __ibBytes.endian = Endian.LITTLE_ENDIAN;
            __ibBytes.length = size*4;
            for (i = 0; i < size; ++i)
            {
                // Indices are 16-bit, so only the first size*2 bytes are
                // consumed by the upload
                __ibBytes.writeShort(1);
            }
            __ibBytes.position = 0;

            // Test hardware first (if available), then software
            setupContext(Context3DRenderMode.AUTO);
        }

        private function setupContext(renderMode:String): void
        {
            __stage3D.addEventListener(Event.CONTEXT3D_CREATE, onContextCreated);
            __stage3D.requestContext3D(renderMode);
        }

        private function onContextCreated(ev:Event): void
        {
            __stage3D.removeEventListener(Event.CONTEXT3D_CREATE, onContextCreated);

            // The log is empty only on the first (AUTO) pass
            var first:Boolean = __logger.text.length == 0;
            if (first)
            {
                // getTimer() reports milliseconds, so the rate is bytes per ms
                __logger.appendText("Driver,Test,Time,Bytes/ms\n");
            }

            const width:int = stage.stageWidth;
            const height:int = stage.stageHeight;

            __context = __stage3D.context3D;
            __context.configureBackBuffer(width, height, 0, true);
            __driverInfo = __context.driverInfo;

            __texture = __context.createTexture(
                2048,
                2048,
                Context3DTextureFormat.BGRA,
                false
            );
            __vertexBuffer = __context.createVertexBuffer(65535, 64);
            __indexBuffer = __context.createIndexBuffer(524287);

            runTests();

            if (first)
            {
                // Run the same tests again with software rendering
                __context.dispose();
                setupContext(Context3DRenderMode.SOFTWARE);
            }
        }

        private function runTests(): void
        {
            var beforeTime:int;
            var afterTime:int;
            var time:int;

            beforeTime = getTimer();
            __texture.uploadFromBitmapData(__bmdNoAlpha);
            afterTime = getTimer();
            time = afterTime - beforeTime;
            row("Texture from BitmapData w/o alpha", time, 2048*2048*4);

            beforeTime = getTimer();
            __texture.uploadFromBitmapData(__bmdAlpha);
            afterTime = getTimer();
            time = afterTime - beforeTime;
            row("Texture from BitmapData w/ alpha", time, 2048*2048*4);

            beforeTime = getTimer();
            __texture.uploadFromByteArray(__texBytes, 0);
            afterTime = getTimer();
            time = afterTime - beforeTime;
            row("Texture from ByteArray", time, 2048*2048*4);

            beforeTime = getTimer();
            __vertexBuffer.uploadFromVector(__vbVector, 0, 65535);
            afterTime = getTimer();
            time = afterTime - beforeTime;
            row("VertexBuffer from Vector", time, 65535*64*4);

            beforeTime = getTimer();
            __vertexBuffer.uploadFromByteArray(__vbBytes, 0, 0, 65535);
            afterTime = getTimer();
            time = afterTime - beforeTime;
            row("VertexBuffer from ByteArray", time, 65535*64*4);

            // Note: the index buffer rows count 4 bytes per index, although
            // Stage3D stores indices as 16-bit values
            beforeTime = getTimer();
            __indexBuffer.uploadFromVector(__ibVector, 0, 524287);
            afterTime = getTimer();
            time = afterTime - beforeTime;
            row("IndexBuffer from Vector", time, 524287*4);

            beforeTime = getTimer();
            __indexBuffer.uploadFromByteArray(__ibBytes, 0, 0, 524287);
            afterTime = getTimer();
            time = afterTime - beforeTime;
            row("IndexBuffer from ByteArray", time, 524287*4);
        }

        // Append one CSV row: driver, test name, time in ms, bytes per ms
        private function row(name:String, time:int, bytes:int): void
        {
            __logger.appendText(
                __driverInfo + "," + name + "," + time + ","
                + (bytes/time).toFixed(2) + "\n"
            );
        }
    }
}
I ran this performance test with the following environment:
- Flex SDK (MXMLC) 4.5.1.21328, compiling in release mode (no debugging or verbose stack traces)
- Release version of Flash Player 11.0.1.152
- 2.4 GHz Intel Core i5
- Mac OS X 10.7.2
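And got these results (time in milliseconds and throughput in bytes per millisecond, since getTimer() reports milliseconds):
Driver | Test | Time (ms) | Bytes/ms |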
OpenGL (Direct blitting) | Texture from BitmapData w/o alpha | 22 | 762600.73 |
OpenGL (Direct blitting) | Texture from BitmapData w/ alpha | 18 | 932067.56 |
OpenGL (Direct blitting) | Texture from ByteArray | 18 | 932067.56 |
OpenGL (Direct blitting) | VertexBuffer from Vector | 42 | 399451.43 |
OpenGL (Direct blitting) | VertexBuffer from ByteArray | 5 | 3355392.00 |
OpenGL (Direct blitting) | IndexBuffer from Vector | 3 | 699049.33 |
OpenGL (Direct blitting) | IndexBuffer from ByteArray | 1 | 2097148.00 |
Software (Direct blitting) | Texture from BitmapData w/o alpha | 12 | 1398101.33 |
Software (Direct blitting) | Texture from BitmapData w/ alpha | 5 | 3355443.20 |
Software (Direct blitting) | Texture from ByteArray | 5 | 3355443.20 |
Software (Direct blitting) | VertexBuffer from Vector | 15 | 1118464.00 |
Software (Direct blitting) | VertexBuffer from ByteArray | 5 | 3355392.00 |
Software (Direct blitting) | IndexBuffer from Vector | 3 | 699049.33 |
Software (Direct blitting) | IndexBuffer from ByteArray | 2 | 1048574.00 |
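The last column is simply the byte count divided by the elapsed milliseconds, so it can be easier to read after converting to megabytes per second. The class below is a minimal sketch of that conversion; `ThroughputExample` and `megabytesPerSecond` are hypothetical names that are not part of the original test, and 1 MB is assumed to mean 1024 × 1024 bytes.

```actionscript
package
{
    import flash.display.Sprite;

    // Hypothetical example class, not part of the original test
    public class ThroughputExample extends Sprite
    {
        public function ThroughputExample()
        {
            // First row of the table: a 2048x2048 texture at 4 bytes per
            // pixel (16 MB) uploaded in 22 ms
            trace(megabytesPerSecond(2048 * 2048 * 4, 22).toFixed(1));
        }

        // Convert a byte count and a getTimer() delta (milliseconds)
        // into megabytes per second
        private static function megabytesPerSecond(bytes:Number, milliseconds:int):Number
        {
            if (milliseconds <= 0)
            {
                // The upload finished within the timer's 1 ms resolution,
                // so no meaningful rate can be computed
                return Number.POSITIVE_INFINITY;
            }
            var bytesPerSecond:Number = bytes * 1000 / milliseconds;
            return bytesPerSecond / (1024 * 1024);
        }
    }
}
```

By this measure the first row works out to 16 MB in 0.022 seconds, or roughly 727 MB per second. The guard against a zero-millisecond measurement matters for the smaller uploads, which take only a few milliseconds.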
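There is a clear order of speed across the tests, regardless of hardware or software rendering or the type of GPU resource being uploaded to:
- ByteArray (fastest)
- Vector
- BitmapData (slowest)
Only the magnitude of the advantage changes. A BitmapData with alpha matched the ByteArray when uploading textures, for example, while one without alpha was slower in both modes. In particular, if you can manage to upload a vertex or index buffer from a ByteArray, you're assured a huge performance win: in hardware mode the vertex buffer upload dropped from 42 ms to 5 ms.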
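As a rough illustration of what uploading from a ByteArray involves, the sketch below packs three vertex positions into a ByteArray and uploads them. The class name `ByteArrayUploadExample`, the function `createTriangle`, and the triangle data are all hypothetical; the sketch assumes a valid Context3D has already been created. Note that Stage3D expects the ByteArray to be little-endian, which is not the default.

```actionscript
package
{
    import flash.display3D.Context3D;
    import flash.display3D.VertexBuffer3D;
    import flash.utils.ByteArray;
    import flash.utils.Endian;

    // Hypothetical example class, not part of the original test
    public class ByteArrayUploadExample
    {
        public static function createTriangle(context:Context3D):VertexBuffer3D
        {
            const numVertices:int = 3;
            const data32PerVertex:int = 3; // x, y, z

            // Illustrative positions for a single triangle
            var positions:Array = [
                -1, -1, 0,
                 1, -1, 0,
                 0,  1, 0
            ];

            // Pack the floats into a little-endian ByteArray
            var bytes:ByteArray = new ByteArray();
            bytes.endian = Endian.LITTLE_ENDIAN;
            for (var i:int = 0; i < positions.length; ++i)
            {
                bytes.writeFloat(positions[i]);
            }

            // Upload starting at byte offset 0 and vertex 0
            var buffer:VertexBuffer3D = context.createVertexBuffer(
                numVertices,
                data32PerVertex
            );
            buffer.uploadFromByteArray(bytes, 0, 0, numVertices);
            return buffer;
        }
    }
}
```

The packing loop has its own cost, so this approach presumably pays off most when the data already lives in a ByteArray (for example, loaded from a binary file) or is built once and uploaded many times.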
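Uploading texture data is much faster in software than in hardware: roughly 2x faster from a BitmapData without alpha and 3.6x faster from a BitmapData with alpha or a ByteArray. As for vertex and index buffers, it's more of a mixed bag. Software is faster when uploading vertex buffers from a Vector, hardware is faster when uploading index buffers from a ByteArray, and the rest are a tie. Vertex buffers are curiously quicker to upload than index buffers when both come from a ByteArray. The difference in throughput is more dramatic with software rendering (about 3x faster) than hardware rendering (about 60% faster), though timings of 1 to 5 ms are close to the 1 ms resolution of getTimer(), so these smaller comparisons should be read with caution.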
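Because Context3DRenderMode.AUTO falls back to software rendering when hardware acceleration is unavailable, it helps to confirm which mode actually produced a given set of numbers. The sketch below checks the driverInfo string, which read "OpenGL (Direct blitting)" and "Software (Direct blitting)" in the results above. `RenderModeCheck` and `isSoftware` are hypothetical names, and the exact strings on other platforms are an assumption.

```actionscript
package
{
    import flash.display3D.Context3D;

    // Hypothetical example class, not part of the original test
    public class RenderModeCheck
    {
        // Returns true when the context reports a software renderer.
        // Assumes the driverInfo string contains the word "software",
        // as it does in the results above.
        public static function isSoftware(context:Context3D):Boolean
        {
            var info:String = context.driverInfo;
            return info.toLowerCase().indexOf("software") != -1;
        }
    }
}
```

Calling `RenderModeCheck.isSoftware(__context)` after the context is created would let the test label each pass explicitly instead of relying on the order in which the contexts were requested.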
More than in any of my previous performance articles, it is important to keep in mind that the results posted above are valid only for the test environment that produced them. These numbers may change on Windows, which uses DirectX instead of OpenGL, or on any of the many mobile handsets using OpenGL ES.