CVPR 2019 Day 1

What an early flight to Long Beach! Woke up at 3:30 am and was glad to find a Lyft driver available in the middle of the night. I have to say Lyft/Uber makes life easier. Just a reminder, though: SJC doesn't open check-in until 4:00 am… so there's no need to rush there.

So Sunday and Monday are for the workshops. In the morning I went to the 3D Scene Understanding workshop and listened to a good talk on "What Do Single-view 3D Reconstruction Networks Learn?" It points out that current state-of-the-art single-image reconstruction work is, to a large extent, just doing image retrieval. This is because the shape similarity measurements in use are not good enough, and the training sets contain models that already look very similar to the ones in the test set. Also, using a canonical pose view of the model as the single input image suits the 2D image case but is not really the best choice for the 3D mesh case. The talk really clears up some issues in 3D reconstruction research, and I think the paper is well worth reading. You can find the paper here, and here is the YouTube video of the talk.
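To make the "shape similarity measurement" point concrete, here is a minimal sketch (my own illustration, not code from the paper) of two common ways to compare a reconstruction against ground truth as sampled point clouds: Chamfer distance and an F-score at a fixed distance threshold. Metrics in this family are exactly the kind the talk argues need care.

```python
# Minimal sketch (my own, not the paper's code): compare two shapes as sampled
# point clouds using Chamfer distance and an F-score at distance threshold tau.
import numpy as np

def nearest_dists(a, b):
    """For each point in a, the distance to its nearest neighbor in b (brute force)."""
    diff = a[:, None, :] - b[None, :, :]          # (N, M, 3) pairwise differences
    return np.sqrt((diff ** 2).sum(-1)).min(axis=1)

def chamfer(a, b):
    """Symmetric Chamfer distance between two point sets."""
    return nearest_dists(a, b).mean() + nearest_dists(b, a).mean()

def f_score(pred, gt, tau=0.02):
    """Harmonic mean of precision/recall of points within tau of the other set."""
    precision = (nearest_dists(pred, gt) < tau).mean()
    recall = (nearest_dists(gt, pred) < tau).mean()
    return 2 * precision * recall / max(precision + recall, 1e-8)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.uniform(-0.5, 0.5, size=(2000, 3))         # stand-in ground-truth surface samples
    pred = gt + rng.normal(scale=0.01, size=gt.shape)   # a slightly noisy "reconstruction"
    print("Chamfer:", chamfer(pred, gt))
    print("F-score @ 0.02:", f_score(pred, gt))
```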

On the same day, however, Facebook AI also presented Mesh R-CNN, which basically reconstructs a mesh from a single image, much as their Mask R-CNN produces 2D masks from a single image. It would be interesting to check that paper and see whether it runs into any of the issues pointed out by the work above.

In the afternoon my colleague led me to the ScanNet benchmark challenge workshop. Professor Matthias Nießner is really active in facial/body reconstruction work, and his work has now expanded to general 3D scene capture and registration. ScanNet aims to build a dataset with vertex-level labels plus 3D bounding boxes, like a 3D version of ImageNet. The workshop was basically a showcase of the teams participating in the detection tasks on the ScanNet dataset. The Stanford team achieved very good results by taking advantage of temporal coherence information. It is a very interesting idea that fundamentally rethinks the data representation and training procedure. Very nice results.

Later in the afternoon I went back to my original research domain for a survey-style talk on state-of-the-art human body/facial capture given by Michael J. Black. I do feel there is great potential here; I need to investigate when I have some spare time. And here is the video for that.

I think on Monday I will be in the AR/VR session. I hope to learn more in this area, or at least to see which parts people still haven't covered yet…

Applying for a Five-Year Multiple-Entry Japan Visa in San Francisco

Here I summarize my experience applying for a Japanese visa at the Consulate-General of Japan in San Francisco. The San Francisco consulate handles visa services for Northern and Central California as well as Nevada, and it requires you to submit the application in person.

First, document preparation. This website sums it up well: https://piao.tips/japan-multi-visa-in-the-us-san-francisco/. Still, based on my own experience there are a few things worth emphasizing.

The official website has a very helpful visa checklist for each nationality: https://www.sf.us.emb-japan.go.jp/itpr_en/e_m02_01_04.html

At my appointment, the officer collected materials exactly according to this checklist, no more and no less. One difference from the blog post above: the personal letter explaining why you need a multiple-entry visa turns out to be quite important, and mine was indeed collected. Also note that the sample letter in that post was written by someone who had been to Japan before and was reapplying, so don't copy it verbatim.

On the application form, besides your name in English on the first page, you also need to write your name in Chinese characters in the designated field, and I signed the back in Chinese characters as well. Note the date format on Japanese forms: DD/MM/YY rather than the usual month-first order.

For item 6 on the checklist, the proof of finances, I prepared both options. The PDF strongly emphasizes that if you go with an employment verification letter plus a recent pay stub, the employment verification cannot be an offer letter. Every company issues employment verification differently. At Amazon, for example, searching "Employment Verification" on the internal site easily turns up the steps to get the letter; it can all be done online without contacting HR. Watch out for the default PIN.

As for the printed flight itinerary, hotel reservation, and travel plan mentioned in most write-ups, the consulate staff did not ask me for them, but I prepared them anyway. Make sure the hotel reservation has your own name on it. It is best to book a refundable room yourself, just for the visa, and cancel it afterwards, so the reservation doesn't show multiple names or omit yours, which could cause trouble if the visa officer happens to look at it.

During the interview, the officer asked whether I had friends in Japan. I suggest saying no; otherwise you will be asked to write down very detailed information about them.

The whole process is quick, about 5 minutes. You then receive a receipt, and 5 business days later you bring the receipt back to the consulate window to pick up your passport. Remember to bring the exact amount in cash (the amount is written on the receipt); they do not give change, and I don't believe credit cards are accepted.

If you don't want to make another trip and you have a friend in the city, you can have someone pick the passport up for you. Just write the proxy's name on the receipt and sign your own name in Chinese. The reason for signing in Chinese is that the form requires the signature to match the one in your passport; with a Chinese passport, sign in Chinese. The proxy's name should probably be written in English, since the proxy has to show a photo ID, and if they use a US driver's license it will only show English. The Japanese staff do everything by the book, so it is better to follow the rules.

Transportation

The consulate opens at 9:30 am. If you are coming from the south, I recommend taking a 7-8 am Caltrain to the San Francisco terminus, then walking two minutes to catch the 10 bus; the route is easy to find on Google Maps. A San Francisco Muni fare is $2.75, you get a receipt after paying, and transfers within 2 hours are free, so in theory, if the consulate isn't busy, you can make the round trip back to the Caltrain terminus on one fare. My visit took about 1 hour in total: when I arrived I drew ticket number 15, and they had just started serving number 1, which gives you an idea of the pace.

 

SIGGRAPH 2018 Day 4

Today was a little casual. In the morning I visited Nvidia's ray tracing/path tracing session. They emphasized that, much like the first GPU card in 1998, RTX is a new thing that everyone should try to catch up with.

Then I went to the 3D capture session. The papers there were all very interesting. I think we are at an important stage at the moment.

In the afternoon I went to the material capture session. It was great to see how a deep learning model can be trained with a differentiable renderer to generate materials from a single image. I do need to look into this work.
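To make that idea concrete, here is a toy sketch (entirely my own, not the session's paper or code): a small network predicts per-pixel albedo and normal maps from a photo, a Lambertian shader re-renders them differentiably under an assumed known light direction, and the rendering loss against the photo drives training. The network, the Lambertian-only shading, and the single fixed light are all simplifying assumptions.

```python
# Toy sketch of "train through a differentiable renderer" (my own simplification).
import torch
import torch.nn as nn

class MaterialNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 6, 3, padding=1),   # 3 albedo + 3 normal channels
        )

    def forward(self, x):
        out = self.net(x)
        albedo = torch.sigmoid(out[:, :3])                   # albedo in [0, 1]
        normal = nn.functional.normalize(out[:, 3:], dim=1)  # unit normals
        return albedo, normal

def lambertian_render(albedo, normal, light_dir):
    """Differentiable Lambertian shading under a directional light."""
    l = nn.functional.normalize(light_dir, dim=0).view(1, 3, 1, 1)
    n_dot_l = (normal * l).sum(dim=1, keepdim=True).clamp(min=0.0)
    return albedo * n_dot_l

model = MaterialNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
photo = torch.rand(1, 3, 64, 64)        # stand-in for a real photo of a material sample
light = torch.tensor([0.2, 0.3, 1.0])   # assumed (known) light direction for that photo

for step in range(200):
    albedo, normal = model(photo)
    rendered = lambertian_render(albedo, normal, light)
    loss = (rendered - photo).abs().mean()  # L1 rendering loss; gradients flow through the shader
    opt.zero_grad()
    loss.backward()
    opt.step()
```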

SIGGRAPH 2018 Day 3

Today was a time-space mixture of an adventure. In the morning I tried to get into the talks for two state-of-the-art face-related papers in the VR session, one from TUM and the other from Facebook Reality Labs. Both try to tackle the problem of showing genuine whole-face expressions in VR while both parties are wearing headsets.

On one side, Matthias Nießner and his stellar face-synthesis team explore how to deal with this issue building on their Face2Face work. The advantage is that, because they use a generic face model, the representation is not strongly subject-dependent, so no calibration or pre-capture is necessary. However, because only an infrared camera inside the headset is used for eye-gaze tracking, the upper face's expression may not be fully preserved.

Facebook, on the other hand, uses a subject-dependent, high-quality model for this work, plus deep learning for teeth compositing. The quality looks better; however, it requires a pre-capture of the subject.

 

And thanks to my friends from Pixar: this year we noticed there was no booth for the animation studio, so we didn't know where to pick up the RenderMan teapot. It turns out they hand them out after their RenderMan 22 demo talk, which lasts an hour. It is actually a really good talk: 30 years of RenderMan development, from a scanline renderer to ray tracing and then to path tracing, giving up old infrastructure in favor of physically correct, simple models. It is great to see that, at this stage, ray-traced lighting can be achieved at interactive speed. With the help of Nvidia's RTX, I think production time for all stages of animation can shrink, and we could see more ideas make it into movies, since the cost of trying out new story lines, camera moves, actions, etc. is lower. But the most important thing is that I got my teapot!

The Real-Time Live! demo session was also crazy. The combined Nvidia RTX, ILMxLAB, and Unreal VR virtual movie-shot demo is a total game changer for how we can make movie-quality shots in real time with everyone inside a virtual environment. I can imagine that in the near future individual shots may be captured in this kind of real-time ray-traced environment; the director can then cut the movie for review and hand the shot off to the offline renderer, if necessary, for the final frames.

SIGGRAPH 2018 Day 2

So today's major coverage is two talks: one from Rob Bredow, VP of ILM, and the other from the CEO of NVIDIA.

Rob's talk was on the power of the creative process, in which he talked about his experience as a first-time VFX producer on the Star Wars movie Solo.

He mentioned that people go through 3 different stages in the creative process:

  • Just starting: when you want to get into the field.

At the beginning, people should study and try to build things on top of others' work, more like interdisciplinary study. It is easier to create something based on existing material.

  • Knowing the theme: when you already know the tools and are trying to work actively in the field.


  • Leading: how to lead the creative process.

During this stage, people need to first define the theme, the concept you are trying to follow, and make sure to stay on that path before diving into the details. He used the example of Solo, where he hoped to go back to the classic 70s film style. Hence the production explicitly used practical rigs for the hyperspeed-travel set and an underwater explosion, relying on real hardware (a huge 180-degree LED screen and a 20,000 fps camera) to get real lighting and an "explosion never seen before."

Then it is about learning the constraints, so people can focus on the right thing. He mentioned how the roller coaster in Disney's Animal Kingdom was created: at first it did not fit the park's style, then people visited Nepal and found the story of the Yeti, building up the Everest-and-Yeti story for the roller coaster.

Third is simplifying: try to keep the target simple. He mentioned a shot in World War where a rig popped into view during a crash scene, which might have required retouching the scene to remove. However, no one actually knows what it is, and people pay attention to the character's face, so it was really not worth spending extra time removing it in the film.

Next is sharing. Rob mentioned the launch of the ASWF, the Academy Software Foundation, where the film industry is for the first time trying to organize its software and share tools between companies.

The topic title.
ASWF actually starts with a lot of big names. I think exploring these repositories could also help new people get into the business.

He also showed the photo book he made during the Solo shoot; I think it is a very good collection.

 

Nvidia's special event was crazy, attracting a lot of people. It was also my first time seeing the CEO's iconic gesture: holding the Nvidia card up on stage. The event was basically the announcement of the next big thing since CUDA was introduced in 2006: the Turing architecture, with which Nvidia makes real-time ray-traced rendering possible.

10 giga rays per second, mixed floating-point and integer operation on the GPU at 16 TFLOPS and 16 TIPS, 500 trillion tensor ops per second, and an 8K image decoder. This monster makes real-time ray tracing possible. It dramatically reduces the time for physically based rendering of movie-quality images, and hence could be very attractive to the movie industry. And since the basic version is not that expensive ($2,300; I think it is more worthwhile than some AR glasses), we may soon expect game developers to not need as many tricks for shading effects and instead just let things follow the laws of physics.
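To put "10 giga rays per second" in perspective, a quick back-of-the-envelope calculation (my own, assuming 1080p at 60 fps; real renderers also lean on denoising, so this is only a rough intuition):

```python
# Rough intuition: how many rays per pixel per frame does 10 Gigarays/s buy
# at an assumed 1920x1080 resolution and 60 fps?
rays_per_second = 10e9
pixels_per_frame = 1920 * 1080
frames_per_second = 60

rays_per_pixel_per_frame = rays_per_second / (pixels_per_frame * frames_per_second)
print(f"~{rays_per_pixel_per_frame:.0f} rays per pixel per frame")  # roughly 80
```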

Mr. Huang really enjoyed playing to the audience with the glossy RTX card.
Demo of the real-time ray-traced Star Wars shot. The lighting does look real!
Introducing how different the hardware/software stack is for the new architecture.


SIGGRAPH 2018 Day 1

Day one was so crowded! Next time, if I arrive a day early, I should do registration first.

So in the morning I went to the Vulkan course. It was really helpful for understanding the API, and I'm glad it has all the support we would expect. I think it is the way to go.

Then we visited the product exhibition; it was nice to see the props from Infinity War and Solo.

The AR session hosted by Apple basically went over what they said at WWDC, which shows AR is still a pretty new thing for the graphics industry. I can sense that people are looking for new things, but they are hesitant about the future.

In that case, what should we do? The Jurassic Park 25th-anniversary screening gives the answer: you just spend your spare time and do it, then disrupt the old business. From 0 to 1, that is how we make progress.

See everyone on day 2.

Hello to SIGGRAPH 2018!

Time indeed goes fast: it has been 3 years since my first, amazing SIGGRAPH experience. Now it is Vancouver, with a new me working on Amazon's AR platform and trying to make it better.

So Sunday is the beginning. I plan to check in and take the Intro to Vulkan course, and maybe the deep learning one later in the afternoon (though I feel the deep learning one may be too basic).

Renewing a Chinese Passport at the San Francisco Consulate

This post records the process of renewing a Chinese passport at the San Francisco consulate. The information is based on my 2017 experience, so please check the official website for the latest rules.

1. Online appointment

a. The consulate currently only accepts online appointments, and at the moment the earliest slots are about 2 months out, so book ahead according to your own schedule.

b. Submitting the photo is a huge headache. The appointment system lets you upload a photo, but it only tells you at the very end whether it passes, and the rejection messages are extremely vague; it took me many tries to get through. You can actually skip this step: the consulate has a dedicated photo booth, $10 for four prints in the standard passport size, which saves you the trouble. Just remember to wear dark clothes and look your best when you show up for your appointment.

In my case, a friend happened to be visiting from China and could print 4 photos for me to the Chinese ID-photo standard, which is why I tried the online submission. Printing Chinese-standard photos in the US is probably not easy, but others online have explained how to do it, so please look that up yourself. Here I only cover how to get the digital photo accepted.

First, take a bareheaded, front-facing photo with no smile, wearing dark clothes, and if you wear glasses make sure they do not cover your eyes. You can shoot it with your phone against a white wall; make sure the lighting is good.

Next, crop the head shot to exactly 354 x 472 pixels. Although the appointment system lets you zoom and pan, my testing showed that your photo only has a chance of passing if it exactly matches their template; any panning or zooming you do inside the tool is a sign the photo may be rejected. Also pay attention to where the eyes end up when you crop.

Even after adjusting, you may find the submission still fails, probably because your background is not white enough. I suggest using the magic wand in Photoshop to cut yourself out and paint the background white; that should get it through.
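For the crop-to-354x472 step above, a minimal Pillow sketch (my own helper; the file names are placeholders, and I still whitened the background by hand in Photoshop):

```python
# Minimal sketch: crop and resize a head shot to exactly 354 x 472 pixels.
# "selfie.jpg" and "visa_photo.jpg" are placeholder file names.
from PIL import Image, ImageOps

TARGET = (354, 472)  # width x height expected by the appointment system

img = Image.open("selfie.jpg")
img = ImageOps.exif_transpose(img)  # respect the phone's orientation tag
# Crop to the 354:472 aspect ratio and resize; nudge the crop slightly upward
# so the eyes land roughly where the template expects them (adjust as needed).
photo = ImageOps.fit(img, TARGET, centering=(0.5, 0.4))
photo.save("visa_photo.jpg", quality=95)
```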

c. After submitting, print the generated form in color and bring it with you on the day. Also remember to print 4 photos at the required size.

2. Requesting passport mail-back service (if you will pick it up yourself, skip to step 3).

If you don't plan to go back into the city to pick up the passport, you can request the mail-back service. Since mailing only accepts a money order or cashier's check, it is best to combine the passport fee and the mailing fee into a single money order or cashier's check. Actually, even if you pick it up yourself, I recommend getting a money order or cashier's check, because the line for paying by credit card is really long... If you have the check ready, you can just hand it to the staff when you submit your forms and be done.

Depending on your situation, a money order or cashier's check comes with a fee. Since money orders usually have a lower fee, and you need to pick up an envelope and postage anyway, I suggest doing everything at the post office and getting a money order there. A USPS Express envelope works; the postage should be the flat Express rate, $6.25. Stick it over the flap as shown in the picture below; the consulate staff will not do it for you. Also remember to ask the postal clerk for a tracking label if your postage doesn't already include tracking information. The tracking label comes in two parts with the same tracking number: stick one on the envelope and keep the other for looking up the status.

So the money order should be for $30: the current passport fee of $25 plus the consulate's $5 mailing fee.

3. Notes for the appointment day.

Remember to bring the required materials:

a. Old passport

b. Color copies of every passport page with content (visas, customs stamps)

c. The form generated by the online appointment

d. 4 photos (if not taking them on site)

e. The envelope with postage attached

f. Money order

g. The I-94 from your most recent entry (not a big deal if you don't bring it)

If you drive, read the street signs carefully!!! On the uphill street along the side of the consulate, one side has no parking from noon into the afternoon on Fridays. Once inside, line up at window 4 or 5. If you haven't taken photos yet, you can take them after entering the passport section and then get in line. I recommend arriving at the consulate half an hour before your appointment time.

When it is your turn, politely hand your form to the staff and submit the corresponding materials, then give your fingerprints and a digital signature. After taking your envelope, they will organize your documents. Officially it usually takes about a month for them to mail you the new and old passports (the old one is invalidated by cutting a corner). In my case I received them in 3 weeks.

We are what we choose to be

Today is May 6, 2017. I spent half the day working on the project so that the app can later make changes independent of the data set. From experiment to implementation, to error hunting and debugging, days like today run through my mind, and this kind of feeling is what I have endured for the past 2 years.

It is always about focusing on one problem and being unable to do anything else besides it. This applies to other domains too, such as job hunting or paper writing. I think this is my advantage: focus. But it is also my weakness when it happens too often in daily life. It is bad if it is 24/7, since I become very skilled at doing one thing but lose the big picture of the work, even of life.

Even PhD study, I think, is not only about concentrating on my tiny area. It requires a deep understanding of one domain while, at the same time, exploring other domains for new ideas.

That is why I feel my start-up life has not been good for me. It has mostly been a journey of learning new things that don't translate well into general knowledge, which became especially obvious as I prepared for CV/CG-related job hunting. My start-up job did give me a very advanced view of the topic, and that is something I really appreciate. However, work-life balance is also critical in the long run. Besides, I have worked alone in my specific domain for these 2 years; I do need communication to break through to new ideas and better solutions. Learning alone is not a good habit.

The past two months have really shown the true face of living in the real world rather than the ivory tower. Making plans and doing multiple things in small pieces of time is what I will try to do going forward.

I am a little regretful that I am not brave enough to choose the path of working on faces. I hope I will have the chance to do it, but I do feel I should start with a more general topic. Face work could be my side project, especially since I have gathered so many resources on possible ways to do it. But the job is the highest priority. After so many years, I think I have come back to where I began, and this time I hope I can polish my skills and create something elegant and useful.

ARBA: Augmented Reality Bay Area meetup

On Jan 24, 2017, we joined the first Augmented Reality meetup hosted at the Runway Incubator in San Francisco. This was also my first time going to the meetup.

So the company above basically makes a Kinect-like device that works with a VR headset to provide body tracking.

This is an interesting device: a company named Ultrahaptics makes an ultrasound emitter array (the green dot is under the palm), which produces air pressure matching what the computer displays. In VR, this makes it possible to feel virtual objects. However, a flat panel only exerts force from one direction; to resolve this, a cube-shaped array can be assembled from the blocks. This could be a good addition to automobile control panels, where Ultrahaptics could provide the feel of real buttons so people don't need to look at the panel while driving.

Wearing a HoloLens in daily life really takes a strong will! All right, the presentations start. First, Occipital demonstrated a smart mixture of AR/VR. They created a headset for the iPhone that also includes a depth camera. In the demo, they showed how the depth camera can be used to scan a room and create a 3D mesh model of that space, with the texture coming from the iPhone camera. After the 3D mesh of the space is generated, the model is loaded into the VR headset so it becomes an AR-like environment. In my opinion, this is an offline real-world-mapping trick that takes advantage of 3D reconstruction; as a result, the system may not be ready to process real-time point cloud data.

 

The next company, Yowza, showed a new idea for converting our real living space into a digital world. In their approach, the raw mesh of the space is captured and uploaded to their cloud, then the point cloud is segmented and each piece classified against complete furniture models in their dataset. The 3D models then replace the raw mesh, ideally creating a complete 3D scene for a VR environment. 3D object recognition and 3D segmentation are hot topics at SIGGRAPH and CVPR; this company's idea would be a very good feature for VR.

The last demo was a Tango-based one from Clever Robot Labs. It analyzes the 3D point cloud from a Tango phone to recognize the ceiling, floor, tables, beds, etc. in real time, and can then replace them dynamically with VR content. Interestingly, the algorithm can use the point cloud of a real table to scale the virtual table that replaces it. Please see the video for the result.

After that, some new members introduced themselves and shared some job postings. It was a very nice experience, focused more on the technical side. I am also glad that our CEO An Li had a good conversation with Ori Inbar. I hope we can keep joining AR meetups this year!