Category Archives: Tech

On the Meta Ray-Ban Display

  • They’re moving the smartwatch display to a monocular lens.
  • The neural wristband is as vital to the glasses as the display, if not more so.
  • This is a first step toward what was shown in the Project Orion demo.
  • This is not 3D or 6DOF, and there are no obvious uses for games.
  • This is meant for displaying information and basic 2D interaction:
    • phone calls (audio/video)
    • photos
    • shortform video
    • notifications
    • messages
    • basic local navigation
  • This is definitely NOT an entertainment device.
  • Unlike the smartwatch, this is also probably not a health device (potential room for Apple here).
  • Meta remains one of the scummier conglomerates in Silicon Valley.

VR/AR News + Fun Links

Progress on Quest/Horizon OS

Since I last wrote on Meta Quest OS (v64 in April), lots of improvements have happened. Per the changelog:

  • v65
    • upload panoramic photos or spatial video to headset via Quest mobile app (supports iOS 17 or later)
    • passthrough environment
    • fewer interruptions from hand tracking when using a physical keyboard or mouse with headset
    • Local multiplayer and boundary recall with Meta Virtual Positioning System
    • Travel Mode for airplane flights (experimental, optional, available only for Quest 2 and 3)
  • v66
    • improvements to passthrough, including [significant] reductions in warping
    • adjustments to exposure, colors, and contrast
    • improvements to background audio for 2D apps, including minimizing apps without automatically pausing playback
    • media controller moved out of notifications into a Media Control Bar under the universal menu to control media playback
    • wrist buttons for clicking Meta and Menu icons (experimental)
    • ability to hide any app (installed or uninstalled) downloaded from the Quest Store
    • teens and children ages 10-12 who are supervised by the same parent or guardian are automatically able to see each other in the Family Center (starting June 27)
    • Sleep Mode added to power-off menu
    • automatic identification and marking of furniture during Space Setup: windows, doors, tables, couches, storage, screens, and beds, with additional furniture types supported over time (documented, optional)
  • v67
    • New Window Layout (experimental):
      • expanded maximum number of open windows from three to six in window layout (up to three docked and three attached)
      • ability to grab and detach windows to position and resize them freely
      • button to temporarily hide other windows in immersive view
      • ability to take any window fullscreen, which hides the other windows and replaces the dock with a simplified control bar (buttons for toggling curvature, passthrough background, and background brightness)
      • replaces explicit Close View and Far View modes
    • new creator videos in Horizon Feed
    • ability to use swipe typing to enter text when using headset
    • improvements to eye tracking recalibration (Quest Pro only)
    • select different durations for Do Not Disturb
    • Wi-Fi QR code scanner (Quest 3 only)
    • open Quest TV or use File Viewer to watch immersive videos without quitting current immersive app
    • developers allowed to swap out Boundary for Passthrough in their apps

Also, the verdicts on the most widely available VR/AR glasses:

  • Mobile XR glasses
    • Brilliant Frame has major issues with functionality
    • Meta Ray-Bans are top notch
    • TCL Ray Neos do most of what is advertised but have potential for more
  • Stationary XR glasses
    • Rokid is meh
    • Xreal Beam Pro is an improvement on the Xreal Beam and expands the capabilities of the Xreal Air 2 Pro
    • Viture Pro holds up to the Xreal Air 2 Pro and is decent for gaming (especially with the Viture Neckband)

Videos

Neurodiverse Friends: Schizophrenia SKIT

I’m now a fan of this animator’s output. Their Neurodiverse Friends series uses animated cats to accurately portray how conditions on the spectrum express themselves.

Queen Coke Francis: Ranking Mr. Birchum Yaoi

Context: Mr. Birchum is an unfortunate adult animated series, produced by the right-wing website Daily Wire, that attempts to be a comedy. Not only are the jokes a collective dud, but quite a few conservatives themselves are turned off by the presence of one (1) openly gay character in the cast, who is meant to be the butt of the jokes anyway.

Anyway, here’s a ranking of the yaoi made of the series.

F.D. Signifier: Kanye was never good

This puts Kanye and his downfall in a new light.

Mashups

In a way, I’m glad that YouTube is not the total repository for fan-created music out there.

Meta Had a Big Two Weeks with Quest

Meta had an intense two weeks. First:

Quest Update v64 Improves Space Setup and Controllers

Quest Platform OS v64, released April 8, came with two undocumented features:

  • Automatic detection and labeling of furniture objects during Space Setup
  • Detection and tracking of both hands and Meta brand controllers

It is interesting that this automatic detection and labeling of objects made it into a regular release of the Quest OS firmware:

  • Without documentation or announcement
  • Within a month after first appearing in a video of a research paper from Reality Labs

My theory: someone else at Meta saw the SceneScript video, thought that the automatic detection and labeling would be a good feature to try adding separately to Quest OS, shipped it as both an Easter egg and an experimental option, and is now quietly gauging user feedback on how well the implementation performs.

To compare it to SceneScript:

It is slower and not as complete as the SceneScript videos, but it definitely seems to save time when adding furniture during Space Setup. This moves the Quest Platform further in the direction of frictionless setup and operation.

In fact, the feature is so undocumented that, at the time of writing, it had only been found and confirmed by Twitter users and publicized by one YouTuber, The Mysticle, and the Android Central blog. So this is very new.

Also, I don’t know if the Vision Pro has automatic labeling of furniture yet, although it does have automatic room scanning given the sheer number of cameras built in.

What may be coming in v65

From what I’ve been reading about the changes in the code of the v65 PTC (Public Test Channel), some things stand out as possibilities:

I guess we’ll find out in early May.

Rebrand and Third-Party Licensing of OS and App Store for Meta Quest

The news from Meta’s April 22 announcement:

  • Meta announced that they’re renaming the OS for Quest devices to Horizon OS, as well as the Quest Store to Horizon Store.
  • Also: Horizon OS will be licensed to third party OEMs, starting with Asus and Lenovo.
  • App Lab will be merged into the Horizon Store, the App Lab name will be retired, and the Horizon Store will have a more lax threshold for accepting app/game submissions.
  • A spatial software framework will be released for mobile app developers to port their apps to Horizon OS.

My thoughts:

  • This is the first time that the operating system for Quest 3 and its predecessors has had an actual brand name.
  • They’re really wanting to distinguish their software and services stack from their hardware.
  • Surprised they didn’t rename the Quest Browser as Horizon Browser. Maybe that will come later?
  • This may be the cementing phase of XR headsets as a computing form factor, a maturation from just an expensive magical toy/paperweight to play games with.
  • Two drawbacks: more hardware to design the OS for, and probably a slower update cycle than the current monthly.
  • We will probably need an FOSS “third thing” as an alternative to both visionOS and Horizon OS.
  • XR hardware makers may flourish and advance by using Horizon OS instead of their own embedded software. Pico, Pimax and HTC come to mind as potential beneficiaries of this.
  • Meta may use this as a launchpad for extending Horizon OS into other form factors, like something that can straddle the gap between XR hardware on one end and the Meta Ray-Ban Glasses on the other end.
  • In terms of software feature parity, Meta has been trying to play catch-up with Apple’s Vision Pro since February and has made it plain that Apple is their real opponent in this market. Google is merely a bothersome boomer at this point.

Other news

VIDEO: Cornell Keeps It Going with Sonar + AI: Now for a Hand-tracking Wristband

Now Cornell’s lab has come up with yet another development, this time not a pair of glasses: EchoWrist, a wristband that uses sonar + AI for hand tracking.

(This also follows from a 2022 (or 2016?) paper about finger tracking on smartwatches using sonar (paper).)

Based on what I’ve read from this Hackerlist summary as well as Cornell’s press release, this is a smart, accessible, less power-hungry and more privacy-friendly addition to the list of sound+AI-based tools coming out of Cornell for interacting with AR. The only question is how accurately the neural network can predict the hand gestures being made.

For comparison, Meta’s ongoing neural wristband project, acquired along with CTRL-Labs in 2019, uses electromyography (EMG) and AI to read muscle movements and nerve signals at the wrist, not only to track hand, finger and arm positioning, but even to interpret intended characters when typing on a bare surface.

There shouldn’t be much distance between EchoWrist, EchoSpeech and using acoustics to detect, interpret and anticipate muscle movements in the wrist (via phonomyography). If sonar+AI can also be enhanced to read neural signals and interpret intended typed characters on a bare surface, then sign me up.

EDIT 4/8/23: surprisingly, there is a way to use ultrasound acoustics to record neural activity.
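
Whether the wrist signal is electrical (EMG, as in Meta’s wristband) or acoustic (sonar/phonomyography, as speculated above), the decoding problem looks roughly the same from the software side: a sliding window of multi-channel samples goes into a model that predicts a hand pose or an intended key. Below is a minimal, hypothetical sketch of that idea; the channel count, window size, features and classifier are placeholders of my own, not anything from Meta or Cornell.

```python
# Hypothetical sketch: decoding an intended key from a short window of wrist
# signals, whether those are EMG channels or acoustic/echo-profile bins.
# Channel count, window size, features and labels are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

N_CHANNELS = 16   # e.g. EMG electrodes or sonar echo bins (placeholder)
WINDOW = 200      # samples per decoding window (placeholder)

def window_features(window: np.ndarray) -> np.ndarray:
    """window: (N_CHANNELS, WINDOW) -> classic per-channel features:
    root-mean-square energy and zero-crossing rate."""
    rms = np.sqrt((window ** 2).mean(axis=1))
    zcr = (np.diff(np.signbit(window).astype(np.int8), axis=1) != 0).mean(axis=1)
    return np.concatenate([rms, zcr])               # shape (2 * N_CHANNELS,)

def fit_decoder(windows: np.ndarray, keys: np.ndarray) -> LogisticRegression:
    """windows: (n_examples, N_CHANNELS, WINDOW); keys: (n_examples,) typed characters.
    Decoders like this are typically calibrated per user."""
    X = np.stack([window_features(w) for w in windows])
    return LogisticRegression(max_iter=1000).fit(X, keys)

# usage with fake calibration data, just to show the shapes involved
rng = np.random.default_rng(0)
train = rng.standard_normal((300, N_CHANNELS, WINDOW))
labels = rng.choice(list("asdf"), size=300)
decoder = fit_decoder(train, labels)
print(decoder.predict(window_features(rng.standard_normal((N_CHANNELS, WINDOW)))[None, :]))
```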

Video of EchoWrist (hand-tracking wristband)

Video of EyeEcho (face-tracking)

Video of GazeTrak (eye-tracking)

Video of PoseSonic (upper-body tracking)

Video of EchoSpeech (mouth-tracking)

Cornell Does It Again: Sonar+AI for eye-tracking

If you remember:

Now, Cornell released another paper on GazeTrak, which uses sonar acoustics with AI to track eye movements.

Our system only needs one speaker and four microphones attached to each side of the glasses. These acoustic sensors capture the formations of the eyeballs and the surrounding areas by emitting encoded inaudible sound towards eyeballs and receiving the reflected signals. These reflected signals are further processed to calculate the echo profiles, which are fed to a customized deep learning pipeline to continuously infer the gaze position. In a user study with 20 participants, GazeTrak achieves an accuracy of 3.6° within the same remounting session and 4.9° across different sessions with a refreshing rate of 83.3 Hz and a power signature of 287.9 mW.
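
To make that abstract a bit more concrete, the pipeline it describes is roughly: emit an encoded, inaudible chirp, record the reflections on the four microphones, turn each frame into an “echo profile” (a cross-correlation of the received audio against the emitted code, so that peaks line up with echo delays), and feed a short window of those profiles into a small neural network that regresses gaze direction. Here is a minimal, hypothetical sketch of that idea; the sample rate, frame length and network are placeholders of mine, not GazeTrak’s actual implementation.

```python
# Hypothetical sketch of a GazeTrak-style pipeline: compute echo profiles from
# four microphones by cross-correlating received audio with the emitted chirp,
# then regress gaze angles with a small CNN. All shapes are placeholders.
import numpy as np
import torch
import torch.nn as nn

FS = 50_000          # assumed sample rate of the inaudible band (placeholder)
FRAME = 600          # samples per transmitted chirp frame (placeholder)
N_MICS = 4           # four microphones per side, per the abstract

def echo_profile(received: np.ndarray, chirp: np.ndarray) -> np.ndarray:
    """received: (N_MICS, n_samples) mic recordings; chirp: (FRAME,) emitted code.
    Returns (N_MICS, n_frames, FRAME) of per-frame cross-correlations ("echo profiles")."""
    n_frames = received.shape[1] // FRAME
    profiles = np.empty((N_MICS, n_frames, FRAME), dtype=np.float32)
    for m in range(N_MICS):
        for f in range(n_frames):
            frame = received[m, f * FRAME:(f + 1) * FRAME]
            # circular cross-correlation via FFT: peaks correspond to echo delays
            corr = np.fft.irfft(np.fft.rfft(frame) * np.conj(np.fft.rfft(chirp)), n=FRAME)
            profiles[m, f] = corr.astype(np.float32)
    return profiles

class GazeRegressor(nn.Module):
    """Toy CNN mapping a short window of echo profiles to (yaw, pitch) in degrees."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(N_MICS, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)), nn.Flatten(),
            nn.Linear(32 * 16, 2),
        )
    def forward(self, x):            # x: (batch, N_MICS, n_frames, FRAME)
        return self.net(x)

# usage: echo profiles -> tensor -> gaze estimate (untrained, just to show shapes)
chirp = np.sin(2 * np.pi * 20_000 * np.arange(FRAME) / FS).astype(np.float32)
mics = np.random.randn(N_MICS, FRAME * 8).astype(np.float32)    # stand-in recording
x = torch.from_numpy(echo_profile(mics, chirp)).unsqueeze(0)    # (1, 4, 8, 600)
yaw_pitch = GazeRegressor()(x)                                  # (1, 2)
```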

Major drawback, however, as summarized by Mixed News:

Because the shape of the eyeball differs from person to person, the AI model used by GazeTrak has to be trained separately for each user. To commercialize the eye-tracking sonar, enough data would have to be collected to create a universal model.

But still, Cornell has now come out with research touting sonar+AI as a replacement for camera sensors (visible and infrared) for body, face and now eye tracking. This increases the possibility of VR and AR hardware that is smaller, more energy-efficient and more privacy-friendly. I’m excited for this work.

Video of GazeTrak (eye-tracking)

Video of PoseSonic (upper-body tracking)

Video of EchoSpeech (mouth-tracking)

Thoughts on the Vision Pro, visionOS and AR/VR

These are my collected thoughts about the Vision Pro, visionOS and at least some of the future of mobile AR as a medium, written in no particular order. I’ve been very interested in this device, how it is being handled by the news media, and how it is broadening and heightening our expectations about augmented reality as those who can afford it apply it in “fringe” venues (e.g., driving, riding a subway, skiing, cooking). I also have thoughts about whether we really need that many optical lenses/sensors, how Maps software could be used in mobile smartglasses AR, and what VRChat-like software could look like in AR. This is disjointed because I’m not having the best time in my life right now.

Initial thoughts

These were mostly written around February 1.

  • The option to use your eyes + pinch gesture to select keys on the virtual keyboard is an interesting way to type out words.
    • But I’ve realized that this should lead, hopefully, to a VR equivalent of swipe-typing on iOS and Android: holding your pinch while you swipe your eyes quickly between the keys before letting go, and letting the software determine what you were trying to type (a rough sketch of how such a decoder might work follows this list). This would give your eyes even more of a workout than they’re already getting, but it may cut down on typing time.
    • I also imagine that the mouth tracking in visionOS could allow for reading your lips for words without having to “listen”, so long as you are looking at a microphone icon. Or maybe that would require tongue tracking, which is a bit more precise.
  • The choice to have menus pop up to the foreground in front of a window is also distinct from Quest OS.
  • The World Wide Web in VR can look far better. This opens an opportunity for reimagining what Web content can look like beyond the WIMP paradigm, because the small text of a web page in desktop view may not cut it.
    • At the very least, a “10-foot interface” for browsing the web in VR should be possible and optional.
  • The weight distribution issue will be interesting to watch unfold as the devices go out to consumers. 360 Rumors sees Apple’s deliberate choice to load the weight on the front as a fatal flaw that the company is too proud to resolve. Might be a business opportunity for third party accessories, however.
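
Following up on the gaze swipe-typing idea above: one way such a decoder could work is classic shape matching, in the spirit of the shape-writing work (SHARK2) that inspired smartphone swipe keyboards. Record the gaze path while the pinch is held, resample and normalize it, and compare it against the ideal path through each candidate word’s key centers. The sketch below is a hypothetical toy version with a made-up key layout and vocabulary, not anything Apple has shipped.

```python
# Hypothetical sketch of a gaze "swipe-typing" decoder via template matching.
# Key layout, vocabulary, and noise levels are invented for illustration.
import numpy as np

# rough x,y centers of a QWERTY layout, one unit per key (illustrative only)
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEYS = {c: (x + 0.5 * r, -r) for r, row in enumerate(ROWS) for x, c in enumerate(row)}

def resample(points: np.ndarray, n: int = 32) -> np.ndarray:
    """Resample a polyline of gaze samples to n equally spaced points."""
    d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(points, axis=0), axis=1))]
    if d[-1] == 0:
        return np.repeat(points[:1], n, axis=0)
    t = np.linspace(0, d[-1], n)
    return np.column_stack([np.interp(t, d, points[:, i]) for i in range(2)])

def normalize(path: np.ndarray) -> np.ndarray:
    """Center and scale a path so matching ignores where/how large it was traced."""
    path = path - path.mean(axis=0)
    scale = np.abs(path).max() or 1.0
    return path / scale

def word_template(word: str) -> np.ndarray:
    """Ideal path a gaze swipe would follow: straight lines through the word's keys."""
    return resample(np.array([KEYS[c] for c in word], dtype=float))

def decode(gaze_path: np.ndarray, vocabulary: list[str]) -> str:
    """Return the vocabulary word whose template is closest to the traced gaze path."""
    probe = normalize(resample(np.asarray(gaze_path, dtype=float)))
    def dist(word: str) -> float:
        return float(np.linalg.norm(probe - normalize(word_template(word)), axis=1).mean())
    return min(vocabulary, key=dist)

# usage: a noisy gaze trace over h -> e -> l -> o should usually decode as "hello"
trace = np.array([KEYS[c] for c in "helo"], dtype=float) + np.random.normal(0, 0.1, (4, 2))
print(decode(trace, ["hello", "world", "help", "melon"]))
```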

Potential future gestures and features

Written February 23.

The Vision Pro’s gestures show an increase in options for computing input beyond pointing with a joystick:

  • Eye-tracked gaze to hover over a button
  • Single quick Pinch to trigger a button
  • Multiple quick pinches while hovering over keys in order to type on a virtual keyboard
  • Dwell
  • Sound actions
  • Voice control

There are even more possible options for visionOS 2.0 both within and likely outside the scope of the Vision Pro’s hardware:

  • My ideas
    • Swiping eye-tracking between keys on a keyboard while holding a pinch in order to quickly type
    • Swiping a finger across the other hand while gazing at a video in order to control playback
    • Scrolling a thumb over a finger in order to scroll up or down a page or through a gallery
    • Optional Animoji, Memoji and filters in visionOS FaceTime for personas
    • Silent voice commands via face and tongue tracking
  • Other ideas (sourced from this concept, these comments, this video)
    • Changing icon layout on home screen
    • Placing app icons in home screen folders
    • iPad apps in Home View
    • Notifications in Home View sidebar
    • Gaze at iPhone, iPad, Apple Watch, Apple TV and HomePod to unlock them and receive notifications
    • Dock for recently closed apps
    • Quick access to control panel
    • Look at hands for Spotlight or Control Center
    • Enable dark mode for iPad apps
    • Resize iPad app windows to create desired workspace
    • Break reminders, reminders to put the headset back on
    • Swappable shortcuts for Action button 
    • User Profiles
    • Unlock and interact with HomeKit devices
    • Optional persistent Siri in space
    • Multiple switchable headset environments 
    • Casting to iPhone/iPad/Apple TV via AirPlay
    • (realtime) Translate
    • Face detection
    • Spatial Find My
    • QR code support
    • Apple Pencil support
    • Handwritten notes detection
    • Widget support
    • 3D (360°) Apple Maps
    • 3D support
    • Support for the iOS/iPadOS keyboard

VRChat in AR?

Written February 23.

  • What will be to augmented reality what VRChat is to VR headsets and Second Life to desktops?
    • Second Life has never been supported by Linden Lab on VR headsets
    • No news or interest from VRChat about a mixed reality mode
    • (Color) Mixed reality is a very early, very open space
    • The software has yet to catch up
    • The methods of AR user input are being fleshed out
    • The user inputs for smartphones and VR headsets have largely settled
    • Very likely that AR headset user input will involve more reading of human gestures, less use of controllers
  • But what could an answer to VRChat or Second Life look like in visionOS or even Quest 3?
    • Issues
      • VRChat (VR headset) and Second Life (desktop) are about full-immersion social interaction in virtual reality
      • FaceTime-like video chat with face-scanned users in panels is the current extent
      • Hardware weight, cost, size all limit further social avatars
      • Device can’t be used outside of stationary settings as per warranty and company policy
      • Lots of limitations to VRChat-like applications which involve engagement with meatspace
  • What about VRChat-like app in full-AR smartglasses?
    • Meeting fellow wearers IRL who apply filters to themselves that are visible to others
    • Geographic AR layers for landmarks
    • 3D AR guided navigation for maps
    • Casting full personal view to other stationary headset/smartglass users
    • Having other users’ avatars visit you at a location and view the location remotely but semi-autonomously

Google Maps Immersive View

Written back on December 23.

Over a year and a half ago, Google announced Immersive View, a feature of Google Maps which would use AI tools like predictive modeling and neural radiance fields (NeRFs) to generate 3D images from Street View and aerial images of both exteriors and interiors of locations, as well as generate information and animations about locations from historical and environmental data for forecasts like weather and traffic. Earlier this year, they announced an expansion of Immersive View to routes (by car, bike or on foot).

This, IMO, is one of Google’s more worthwhile deployments of AI: applying it to mashup data from other Google Maps features, as well as the library of content built by Google and third-party users of Google Maps, to create more immersive features.

I just wonder when they will apply Immersive View to Google Earth.

Granted, Google Earth has had 3D models of buildings for a long time, initially with user-generated models in 2009, which were then replaced with autogenerated photogrammetric models starting in 2012. By 2016, 3D models had been generated in Google Earth for locations, including their interiors, in 40 countries, including locations in every U.S. state. So it does seem that Immersive View brings the same types of photogrammetric 3D models of select locations to Google Maps.

The differences between Immersive View and Google Earth seem to be the following:

  • animations of moving cars simulating traffic
  • predictive forecasts of weather, traffic and busyness up to a month ahead, with accompanying animation, for locations
  • all of the above for plotted routes as well

But I think there is a good use case for the idea of Immersive View in Google Earth. Google touts Immersive View in Maps as “getting the vibe” of a location or route before one takes it. Google Earth, which shares access to Street View with Google Maps, is one of a number of “virtual globe” apps made to give cursory, birds-eye views of the globe (and other planetary bodies). But given the use of feature-rich virtual globe apps in VR headsets like Meta Quest 3 (see: Wooorld VR, AnyWR VR, which both have access to Google Earth and Street View’s data), I am pretty sure that there is a niche overlap of users who want to “slow-view” Street View locations and routes for virtual tourism purposes without leaving their house, especially using a VR headset.

But an “Immersive View” for Google Earth and associated third-party apps may have to go in a different direction than Immersive View in Maps.

The AI-driven Immersive View can easily fit into Google Earth as a tool, smoothing over more of the limitations of virtual globes as a virtual tourism medium and adding more interactivity to Street View.

Sonar+AI in AR/VR?

Written around February 17.

Now if only someone would try hand-tracking, or maybe even eye-tracking, using sonar. The Vision Pro’s 12 cameras (out of 23 total sensors) need at least some replacement with smaller analogues:

  • Two main cameras for video and photo
  • Four downward-facing, two TrueDepth and two sideways world-facing tracking cameras for detecting your environment in stereoscopic 3D
  • Four internal infrared tracking cameras that track every movement your eyes make, as well as an undetermined number of external infrared cameras for seeing regardless of lighting conditions
  • LiDAR
  • Ambient light sensor
  • Two infrared illuminators
  • Accelerometer & Gyroscope

Out of these, perhaps the stereoscopic cameras are the best candidates for replacement with sonar components.

I can see hand-tracking, body-tracking and playspace boundary tracking being made possible with the Sonar+AI combination.

Fediverse Integration via ActivityPub in the News

The question is: how much money can they spare for integration? 

Decentralization apparently costs money if your site is built around centralized interaction. But, IMO:

  • Smaller sites benefit in the long run because they can focus on taking care of those users who care to use their instance.
  • Larger sites benefit through better moderation of content, with fewer users complaining about censorship.

Better to shoot that shot now rather than wait and wither on the vine.

The original comic took place in Tokyo, but to make it their own, Hall and Williams decided to set the story in a brand new yet familiar city, melding the original locale with a near-future San Francisco. “It’s a very high-tech city that blends Eastern and Western culture, so we wanted it to be a mashup, just like the movie is a mashup between Disney and Marvel,” says Hall.

The resulting animated metropolis—which truly is its own character in the film, although people say that a lot—is a celebration of futuristic urban life and the high-tech culture that drives its residents. And it’s brought to life thanks to several new animation technologies developed in-house by Disney itself.

via A Tour of ‘San Fransokyo,’ the Hybrid City Disney Built for Big Hero 6.

I’m rather enamored of the trailers, and the city looks freaking amazing so far. I want to see it.